On 22/09/14 20:29, Nathan Scott wrote:
Hi all,
I'm seeing a new, reliable telnet-probe hang in qa/835 ... anyone else
come across this one? Haven't dug deeper yet, will do so tomorrow.
Hmm ... I'm not even getting to qa/835 now.
My last 3 QA runs are hung in qa/443 ... neither this test nor pmevent
has been subject to recent changes.
kenj@bozo-vm:~/src/pcp/qa$ pstree 23294
check───sh─┬─pmevent
           └─sh───sed
kenj@vm00:~/src$ pstree 12770
check───sh─┬─pmevent
           └─sh───sed
kenj@grundy:~$ pstree 27358
check───sh─┬─pmevent
           └─sh───sed
And here is the problem ... pmevent is not getting an error back for the
bad -h arg ... and loops forever using the local pmcd as a context.
kenj@bozo:~$ pmevent -h no.such.host sample.event.records
host: bozo
samples: all
sample.event.records[fungus]: 2 event records
08:15:48.916 --- event record [0] flags 0x1 (point) ---
sample.event.type 1
08:15:49.916 --- event record [1] flags 0x1 (point) ---
sample.event.type 2
sample.event.param_64 -3
sample.event.records[bogus]: 1 event records
08:15:58.916 --- event record [0] flags 0x1 (point) ---
sample.event.param_string "fetch #286"
sample.event.records[fungus]: 0 event records
sample.event.records[bogus]: 2 event records
08:15:59.919 --- event record [0] flags 0x1 (point) ---
sample.event.param_string "fetch #288"
08:15:59.919 --- event record [1] flags 0x1 (point) ---
sample.event.param_string "bingo!"
^C
And here is the root cause ...
kenj@bozo:~$ pmevent -Dcontext -h no.such.host sample.event.records
__pmSetSocketIPC: fd=3
IPC table fd(PDU version):
__pmDecodeXtendError: got error PDU (code=0, datum=385876226, version=2)
__pmSetVersionIPC: fd=3 version=2
IPC table fd(PDU version): 3(2,1)
__pmSendCreds: #0 = 1020000
__pmConnectPMCD(no.such.host): pmcd connection port=44321 fd=3 PDU version=2
IPC table fd(PDU version): 3(2,1)
pmNewContext(1, no.such.host) -> 0
Someone's broken pmNewContext(), it appears to me: it returns a valid
context handle (0) instead of an error for a host that does not exist.
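
Until this is fixed, a guard along the following lines would turn the hang
into a hard failure instead of stalling the whole QA run. This is only a
minimal sketch and not anything that exists in the QA suite:
run_with_deadline() and check_bad_host() are hypothetical helper names, and
the 10 second limit is an arbitrary assumption.

```python
import subprocess
import sys


def run_with_deadline(argv, seconds):
    """Run argv and return (status, output).

    status is the exit code, or None if the command was still running
    when the deadline expired (i.e. it looks hung).
    """
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into the captured output
            timeout=seconds,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising, so no
        # runaway process is left behind.
        return None, ""
    return done.returncode, done.stdout


def check_bad_host(seconds=10):
    """Hypothetical check: pmevent with a bad -h argument should exit
    promptly rather than loop on some other context."""
    cmd = ["pmevent", "-h", "no.such.host", "sample.event.records"]
    status, output = run_with_deadline(cmd, seconds)
    if status is None:
        print("FAIL: pmevent still running after deadline", file=sys.stderr)
        return False
    print("pmevent exited with status", status)
    print(output, end="")
    return True
```

Calling check_bad_host() on a box with PCP installed should report the hang
seen above as a FAIL rather than sitting there until someone hits ^C. It
only detects the symptom; it says nothing about why pmNewContext()
succeeded.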