Re: [pcp] pmcd gets stuck with pmda kill

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>, Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] pmcd gets stuck with pmda kill
From: Martins Innus <minnus@xxxxxxxxxxx>
Date: Thu, 29 Jan 2015 14:54:54 -0500
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <54C95A00.2060006@xxxxxxxxxxxxxxxx>
References: <54C7FF66.5090503@xxxxxxxxxxx> <1902595642.1770600.1422398645794.JavaMail.zimbra@xxxxxxxxxx> <54C9441E.4060302@xxxxxxxxxxxxxxxx> <54C946A7.3080002@xxxxxxxxxxx> <54C95A00.2060006@xxxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
Ken,

On 1/28/2015 4:52 PM, Ken McDonell wrote:

But is it valid to assume that, as a separate case, pmcd should continue
to function if a pmda gets "killed" in some other way? OOM killer, some
other error?

Yes. Unless, of course, the OOM condition is so extreme as to get pmcd killed.

The design point was that pmcd should continue to operate under extreme circumstances, and pmdas coming and going (for whatever reason) falls within that mantra.

If you have a counterexample, I'd be interested to hear about it.

Yes, my counterexample is the one I posted earlier, which showed a backtrace with a problem in AgentsAttributes:

http://oss.sgi.com/archives/pcp/2015-01/msg00149.html

I muddled the issue because it seemed like a pmie problem. But basically the following occurs for me:

>killall -v pmdaproc
    Killed pmdaproc(23682) with signal 15

>pmval pmcd.agent.status
    pmval: pmLookupDesc: IPC protocol failure

>pmval hinv.ncpu
    pmval: pmLookupDesc: IPC protocol failure

>pminfo hinv.ncpu
    Error: hinv.ncpu: Broken pipe

The only things that bring it back to life are running "pminfo" or "pminfo proc", or having a pmlogger instance that is logging a proc metric.
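
For reference, the behaviour expected from the design point quoted above (one agent dying should fail only that agent's requests, not every request) can be sketched generically. This is a minimal illustration in Python, not pmcd's actual code: the names Agent, fetch_all and demo are hypothetical, and the "agent" here is just a child process that echoes lines back over a pipe.

```python
import subprocess
import sys

# Hypothetical stand-in for a pmda: echo each request line back to the parent.
CHILD_CODE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)


class Agent:
    """One child process reached over a pair of unbuffered pipes."""

    def __init__(self, name):
        self.name = name
        self.ready = True
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", CHILD_CODE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0)

    def request(self, text):
        """Return the agent's reply, or None if the agent is gone."""
        if not self.ready:
            return None
        try:
            self.proc.stdin.write(text.encode() + b"\n")
            reply = self.proc.stdout.readline()
        except OSError:
            # Includes BrokenPipeError: the agent closed its end of the pipe.
            reply = b""
        if not reply:
            # Dead agent: mark only this agent as unavailable and carry on.
            self.ready = False
            self.shutdown()
            return None
        return reply.decode().rstrip("\n")

    def shutdown(self):
        """Close both pipes (EOF makes a live child exit) and reap it."""
        for pipe in (self.proc.stdin, self.proc.stdout):
            try:
                pipe.close()
            except OSError:
                pass
        self.proc.wait()


def fetch_all(agents, text):
    """Ask every agent; a dead agent yields None without affecting the rest."""
    return {agent.name: agent.request(text) for agent in agents}


def demo():
    agents = [Agent("proc"), Agent("linux")]
    before = fetch_all(agents, "ping")
    agents[0].proc.terminate()   # SIGTERM on POSIX, like signal 15 above
    agents[0].proc.wait()
    after = fetch_all(agents, "ping")
    for agent in agents:
        agent.shutdown()
    return before, after


if __name__ == "__main__":
    print(demo())
```

In this sketch the request to the surviving agent still succeeds after the other one is terminated, which is the behaviour I would have expected for hinv.ncpu in the transcript above.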

Martins

