
Re: Debugging sigpipe in pmda

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: Debugging sigpipe in pmda
From: fche@xxxxxxxxxx (Frank Ch. Eigler)
Date: Tue, 16 Aug 2016 21:40:37 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <83f3710f-d758-6f7d-d9af-480fb897f4c8@xxxxxxxxxxxxxxxx> (Ken McDonell's message of "Wed, 17 Aug 2016 07:23:28 +1000")
References: <df62753e-0d3d-3626-cd6e-ed1f8e17fd2e@xxxxxxx> <1831980510.1015515.1470956662271.JavaMail.zimbra@xxxxxxxxxx> <b735a150-5aa2-04f0-d9df-f4e8eb699c19@xxxxxxx> <y0m1t1ophga.fsf@xxxxxxxx> <83f3710f-d758-6f7d-d9af-480fb897f4c8@xxxxxxxxxxxxxxxx>
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)

Hi, Ken -

kenj wrote:

> [...]
> I agree on the strace for PMDA not pmcd comments.
>
>> ...
>> Come to think of it, there are few PMDAs that have NOT been hit by
>> this issue at some point.  I wonder if it's time that a more systemic
>> solution be invented (not just restarting timed-out pmdas).
>
> But I think this assertion is not correct ... there are in fact very
> few PMDAs that have hit this issue, specifically there are 81 PMDAs in
> the current source tree and very few of these have triggered PDU
> timeout issues for pmcd.  The most notable and long-standing cases are
> the DBMS PMDAs where SQL queries are used.

Sorry, I was not speaking of the complete census of 81.  Of the
handful of ones activated by default, plus a few activated by
curiosity, I've encountered timeouts for most of them: pmdalinux,
pmdaproc, pmdasystemd, pmdapapi, pmdarpm, probably even pmdammv, all
in living memory.


> And the "solution" is a standard one ...
> [...]

And I wonder if libpcp_pmda should automate it so pmdas don't have to
reinvent it.


> [...]  This [background-thread] approach reduces the quality of the
> data (in terms of timeliness) and adds overhead (the refreshing
> thread runs even if no client of pmcd is requesting the data).
> [...]

The second part of that need not be true.  The PMDA could be
responsive to clients asking for data, and start its poll thread only
during such times and even only at such observed fetching rates.
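
To make that idea concrete, here is a minimal sketch in Python.  It is
not PCP code and implies nothing about the libpcp_pmda API: the class
DemandDrivenCache, its parameters, and the refresh callable are all
hypothetical names invented for illustration.  The assumption is that
the fetch path only ever returns the cached value, while a poll thread
started on the first fetch runs the slow refresh at the observed fetch
interval and exits once clients stop asking.

```python
import threading
import time


class DemandDrivenCache:
    """Hypothetical helper: poll a slow source only while clients fetch."""

    def __init__(self, refresh, idle_timeout=5.0, default_interval=1.0):
        self._refresh = refresh              # slow callable, e.g. an SQL query
        self._idle_timeout = idle_timeout    # stop polling after this much idleness
        self._interval = default_interval    # used until a fetch rate is observed
        self._lock = threading.Lock()
        self._value = None                   # None means "no data yet"
        self._last_fetch = None
        self._thread = None

    def fetch(self):
        """Fetch path: returns the cached value, never runs the slow refresh."""
        now = time.monotonic()
        with self._lock:
            if self._last_fetch is not None:
                # Poll at roughly the rate at which clients are fetching.
                self._interval = max(now - self._last_fetch, 0.01)
            self._last_fetch = now
            if self._thread is None:
                # First fetch after an idle period: start the poll thread.
                self._thread = threading.Thread(target=self._poll, daemon=True)
                self._thread.start()
            return self._value

    def _poll(self):
        while True:
            value = self._refresh()          # slow work happens outside the lock
            with self._lock:
                self._value = value
                interval = self._interval
                if time.monotonic() - self._last_fetch > self._idle_timeout:
                    # Nobody is asking any more: stop until the next fetch.
                    self._thread = None
                    return
            time.sleep(interval)
```

The first fetch after an idle period gets no (or stale) data, so a real
PMDA would need some policy for that case; the sketch just returns
whatever is cached.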


- FChE
