
pmcd dumping core - multiple issues

To: PCP Mailing List <pcp@xxxxxxxxxxx>
Subject: pmcd dumping core - multiple issues
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Sun, 28 Jul 2013 08:23:18 +1000
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130623 Thunderbird/17.0.7
I am seeing qa/183 failing across lots of hosts in a full run, i.e. $ check (no args)

On investigation I see the logger and trace PMDA installed (which is odd) and pmnewlog seems to be having trouble talking to pmlogger via pmlc to get info about the "logger" metrics ...

        Problem with lookup for metric "logger" ...
        Reason: No PMCD agent for domain of request

So in an attempt to diagnose this I tried to Remove the logger PMDA and this happened ...

[Sun Jul 28 07:08:46] pmcd(23023) Info: CleanupAgent ...
Cleanup "logger" agent (dom 106): unconfigured, exit(1)

->PMCD event trace: starting at Sun Jul 28 07:08:46 2013
->         New client: [1] -- unknown
?->         Xmit: ERROR PDU, fd=1028, err=0: No error
->         Recv: CREDS PDU, fd=1028, pdubuf=0xb8424000
->         Recv: CREDS PDU, fd=1028, pdubuf=0x1
->         Recv: PMNS_TRAVERSE PDU, fd=1028, pdubuf=0xb8422000
->         Xmit: PMNS_NAMES PDU, fd=1028, numpmid=1
->         Recv: PMNS_NAMES PDU, fd=1028, pdubuf=0xb8424000
->         Xmit: PMNS_IDS PDU, fd=1028, numpmid=1
->         Recv: PROFILE PDU, fd=1028, pdubuf=0xb8422000
->         Recv: FETCH PDU, fd=1028, pdubuf=0xb8424000
->         Xmit: RESULT PDU, fd=1028, numpmid=1
->         Recv: DESC_REQ PDU, fd=1028, pdubuf=0xb8422000
->         Xmit: DESC PDU, fd=1028, pmid=2.0.7
->         End client: fd=1028
->         Xmit: ERROR PDU, fd=10, err=-12391: Not Connected
->         Xmit: ERROR PDU, fd=12, err=-12391: Not Connected
->         Xmit: ERROR PDU, fd=16, err=-12391: Not Connected
->         Xmit: ERROR PDU, fd=18, err=-12391: Not Connected
->         Xmit: ERROR PDU, fd=20, err=-12391: Not Connected
->         Drop PMDA: domain=106, infd=16, outfd=17

[Sun Jul 28 07:08:46] pmcd(23023) Error: Unexpected signal 11 ...

Dumping to core ...

Now this is a non-negotiable release blocker.

pmcd is not allowed to dump core ... we've spent 10 years getting to this point, and we're going to keep it that way.

The "New client" message is also a worry: "-- unknown" followed by a newline and a "?" is neither expected nor helpful.

And finally we've lost the procedure call traceback ... the relevant code is guarded by
#if HAVE_TRACE_BACK_STACK
but NOTHING appears to define HAVE_TRACE_BACK_STACK under any circumstances ... can anyone explain what happened here?
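
For illustration only, here is a minimal sketch of how a guard like this behaves. It is not the pmcd source: the macro name HAVE_BACKTRACE and the functions dump_traceback() and fatal_signal_handler() are hypothetical, and it assumes glibc's backtrace() and backtrace_symbols_fd() from <execinfo.h>. The point is that when nothing defines the guard macro, the traceback code is compiled out without any warning and only the fallback path remains.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * HAVE_BACKTRACE is a hypothetical guard name for this sketch.  Normally a
 * configure-time check would define it; here we assume glibc provides
 * <execinfo.h> and define it ourselves in that case.
 */
#if defined(__GLIBC__) && !defined(HAVE_BACKTRACE)
#define HAVE_BACKTRACE 1
#endif

#ifdef HAVE_BACKTRACE
#include <execinfo.h>
#endif

#define MAX_FRAMES 64

/*
 * Write the current call stack to fd.
 * Returns the number of frames written, or 0 if the guard is not defined.
 */
int
dump_traceback(int fd)
{
#ifdef HAVE_BACKTRACE
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols_fd() writes straight to fd, one frame per line */
    backtrace_symbols_fd(frames, n, fd);
    return n;
#else
    /* guard never defined: the traceback silently disappears from the build */
    static const char msg[] = "traceback not available in this build\n";

    if (write(fd, msg, sizeof(msg) - 1) < 0)
        return 0;
    return 0;
#endif
}

/* Hypothetical handler: report, then re-raise so the core dump still happens. */
void
fatal_signal_handler(int sig)
{
    dump_traceback(STDERR_FILENO);
    signal(sig, SIG_DFL);
    raise(sig);
}
```

A handler like this would be installed with something like signal(SIGSEGV, fatal_signal_handler). Building with an #error in the #else branch is one way to find out quickly whether the guard is ever defined.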
