Comment # 3 on bug 1158 from Ken McDonell
I've created an FSA to analyze the client side (pmwebd here) PDU traces.
kenj@bozo:~/Downloads$ client-pdu-fsa pmwebd.log
FSA failed: state=X-PMNS_IDS next_state=R-DESC @ line 773091
[10995]pmGetPDU: DESC fd=8 len=32 from=0
FSA failed: state=X-DESC_REQ next_state=R-PMNS_NAMES @ line 773095
[10995]pmGetPDU: PMNS_NAMES fd=8 len=48 from=0
FSA failed: state=X-PMNS_IDS next_state=R-DESC @ line 773104
[10995]pmGetPDU: DESC fd=8 len=32 from=0
FSA failed: state=X-DESC_REQ next_state=R-PMNS_NAMES @ line 773108
[10995]pmGetPDU: PMNS_NAMES fd=8 len=40 from=0
FSA failed: state=X-PMNS_IDS next_state=R-DESC @ line 773117
[10995]pmGetPDU: DESC fd=8 len=32 from=0
FSA failed: state=X-DESC_REQ next_state=R-PMNS_NAMES @ line 773121
[10995]pmGetPDU: PMNS_NAMES fd=8 len=52 from=0
I'll attach the script and the full report to the bug in a moment.
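The attached script is not reproduced here. As a rough sketch only of the kind of check such an FSA makes, the Python below pairs each transmitted request with the reply type it should get back on the same fd. Everything in it is an assumption for illustration, not the real client-pdu-fsa: the function name check_trace, the shape of the pmXmitPDU lines (modelled on the pmGetPDU lines above), ERROR being accepted as a reply, and the two request/reply pairs, which are inferred from the failures reported above.

```python
import re

# Assumed request -> expected reply pairs, inferred from the failures above.
EXPECTED_REPLY = {
    "PMNS_IDS": "PMNS_NAMES",
    "DESC_REQ": "DESC",
}

# Assumed trace line shape: "[pid]pmXmitPDU: TYPE fd=N ..." for requests and
# "[pid]pmGetPDU: TYPE fd=N ..." for replies.
PDU_LINE = re.compile(r"^\[(\d+)\]pm(Xmit|Get)PDU: (\w+) fd=(\d+)")


def check_trace(lines):
    """Return (line_number, state, next_state) for every mismatched reply."""
    pending = {}   # fd -> request type still waiting for its reply
    failures = []
    for lineno, line in enumerate(lines, start=1):
        m = PDU_LINE.match(line)
        if m is None:
            continue   # not a PDU trace line
        _pid, direction, pdu_type, fd = m.groups()
        if direction == "Xmit":
            # Only track request types this sketch knows how to pair.
            if pdu_type in EXPECTED_REPLY:
                pending[fd] = pdu_type
            continue
        request = pending.pop(fd, None)
        if request is None:
            continue   # reply with no tracked request outstanding
        # Assumption: an ERROR PDU is an acceptable reply to any request.
        if pdu_type not in (EXPECTED_REPLY[request], "ERROR"):
            failures.append((lineno, "X-" + request, "R-" + pdu_type))
    return failures
```

Feeding it the lines of pmwebd.log (for example check_trace(open("pmwebd.log"))) would give one tuple per out-of-order reply, in the same state/next_state form as the report above. A real FSA would need to cover every PDU type and handle per-context state, which this sketch does not attempt.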
The interesting thing is that we've just received a RESULT at line 769861, and
then, because the diags are enabled, we do a bucketload of requests to pmcd to
translate PMIDs to names, get pmDesc's, do indom lookups ... all for the pmResult
dump at line 771093. But in that dump some of the metric names and instance
names were not retrieved from pmcd; see the <noname> and ??? lines starting at
772579.
Then _after_ the pmResult dump, we see a scrambled mess of PDU interactions that
are out of order and relate to the pmResult dump, not to pmwebd fetching metrics.
If pmcd was really overloaded at this point (the pmResult dumping certainly
would not help), then we'd be seeing all sorts of timeouts ... combine this
with asynchronous dead hand timers (I assume this is what the lines like this
[Tue Jul 26 20:29:30] pmwebd(10995): context (web1373406223=pm1) expired.
in the log mean) and I'm not sure what to expect.
What -D flags were set for pmwebd during this run?
If we're after an original error of an IPC failure (rather than trying to
induce one) then we don't need -Dfetch; -Dpdu will probably suffice.
If the experiment is reproducible, then it would also be helpful to see
pmcd.log with a pmcd instance running with -Dappl0,pdu along with the
pmwebd.log for pmwebd running with -Dpdu.