Re: [pcp] automatic derived metrics slowing down remote pcp clients, esp

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>, pcp developers <pcp@xxxxxxxxxxx>, mgoodwin@xxxxxxxxxx
Subject: Re: [pcp] automatic derived metrics slowing down remote pcp clients, esp. pmlogconf
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Sun, 15 May 2016 12:17:05 +1000
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <20160514193945.GC1418@xxxxxxxxxx>
References: <20160514193945.GC1418@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
On 15/05/16 05:39, Frank Ch. Eigler wrote:
> Hi -
> 
> pmlogconf is used by service-pmlogger (intermittently) and
> service-pmmgr (frequently). ...

Therein lies a potential problem ... pmlogconf and pmieconf were created only 
to make the process of creating a pmlogger (or pmie) configuration file easier, 
and were never intended for frequent execution ... they are more in the "use 
once now and then" camp.

Each time you run pmlogconf, (a) it will probe a bunch of metrics that are 
unlikely to exist on the remote host (these are for the probe guards associated 
with the optional clauses), and (worse) (b) generate _exactly_ the same 
pmlogger configuration file as the last time pmlogconf was run (for this host).

Rate limitation on the use of pmlogconf (especially from service-pmmgr) would 
help, but would not solve the issue you're seeing.
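
One way such rate limitation could look, as a rough sketch only: skip the 
regeneration when the existing configuration file is recent enough. The 
function names config_is_fresh and regen_if_stale are hypothetical (nothing 
like them ships with PCP as far as this sketch assumes), and the age test 
relies on the -mmin option of GNU and BSD find.

```shell
# Hypothetical helper: succeed if FILE exists and was last modified less
# than MINUTES minutes ago. Uses find's -mmin test (GNU and BSD find).
config_is_fresh() {
    file=$1
    minutes=$2
    [ -n "$(find "$file" -prune -mmin "-$minutes" 2>/dev/null)" ]
}

# Hypothetical wrapper: run the given command only when FILE is missing
# or older than MINUTES minutes; otherwise do nothing.
regen_if_stale() {
    file=$1
    minutes=$2
    shift 2
    if config_is_fresh "$file" "$minutes"; then
        return 0
    fi
    "$@"
}
```

A caller would pass the configuration file, a maximum age in minutes, and 
then whatever pmlogconf command line it would otherwise have run. This only 
reduces how often the cost is paid; each run that does happen is as 
expensive as before.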

> ... It has recently gotten much much slower,
> and I finally figured out why.  It's the derived metrics processing.
> pmlogconf involves about a hundred pmprobe calls.  Each pmprobe is
> supposed to do just one fetch on a given metric to see if it exists.
> That should take only a couple of packets to the remote pmcd.

I'd be willing to assert that the probe guards in pmlogconf files should _not_ 
use derived metrics, and in this case running pmprobe(1) with an empty 
PCP_DERIVED_CONFIG (i.e. "PCP_DERIVED_CONFIG= pmprobe ...") would work fine ... 
as proof

kenj@bozo:~$ pmprobe -Dpdu foo 2>&1 | grep XmitPDU | sed -e 's/ fd.*//' -e 's/.*: //' | sort | uniq -c
      1 CREDS
     45 DESC_REQ
     63 PMNS_NAMES
      1 PMNS_TRAVERSE

kenj@bozo:~$ PCP_DERIVED_CONFIG= pmprobe -Dpdu foo 2>&1 | grep XmitPDU | sed -e 's/ fd.*//' -e 's/.*: //' | sort | uniq -c
      1 CREDS
      1 PMNS_TRAVERSE

kenj@bozo:~$ cat /var/lib/pcp/config/derived/* | sed -e '/^#/d' | wc -l
18

kenj@bozo:~$ cat /var/lib/pcp/config/derived/* | sed -e '/^#/d' -e 's/ //g' -e 's/^[^ ]*=//' | tr '[()+*/-]' '\012' | sed -e '/^$/d' -e '/^delta$/d' -e '/^rate$/d' | wc -l
45

So for me the 18 derived metrics and their 45 operand metrics in the "standard" 
derived metric configs account for an additional 108 PDU round trips (45 
DESC_REQ + 63 PMNS_NAMES) ... and this happens for _every_ probe guard in the 
pmlogconf file, so ...

kenj@bozo:~$ grep -r '^probe' /var/lib/pcp/config/pmlogconf | wc -l
85

85 * 108 = 9180 wasted PDU round trips.
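
The counts will differ from host to host, so here is a small sketch for 
redoing the estimate with your own numbers. The function name wasted_pdus is 
hypothetical; its three arguments are the extra DESC_REQ count, the extra 
PMNS_NAMES count, and the number of probe guards, all taken from commands 
like the ones above.

```shell
# Hypothetical helper: estimate the extra PDUs sent per pmlogconf run.
# $1 = extra DESC_REQ PDUs per pmprobe
# $2 = extra PMNS_NAMES PDUs per pmprobe
# $3 = number of probe guards (one pmprobe each)
wasted_pdus() {
    per_probe=$(( $1 + $2 ))
    echo $(( per_probe * $3 ))
}

# Counts measured above on bozo.
wasted_pdus 45 63 85
```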

> ...
> So, what to do?  Some options:
> 
> - nothing, bletch
> 
> - redefine pmlogconf to exclude derived metrics, and have it set
>    env PCP_DERIVED_CONFIG="" for itself / pmprobe

I'd suggest doing this only for the pmprobe executions, as described above ... 
this limits the scope of the change and minimizes any backwards compatibility 
fallout.
 
And document the restriction on pmlogconf "probe" guards.
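
A minimal sketch of what that might look like in a shell script, assuming the 
script funnels its probes through one helper. The names probe_no_derived and 
PMPROBE are hypothetical and are not taken from the actual pmlogconf source; 
the only thing borrowed from the demonstration above is that an empty 
PCP_DERIVED_CONFIG in pmprobe's environment avoids the derived metric PDUs.

```shell
# Assumption: PMPROBE can be overridden (e.g. for testing); by default it
# is the real pmprobe found on $PATH.
PMPROBE=${PMPROBE:-pmprobe}

# Hypothetical wrapper: run pmprobe with PCP_DERIVED_CONFIG set to the
# empty string for that one command only. The caller's environment is
# not changed, so other PCP tools still see the derived metrics.
probe_no_derived() {
    PCP_DERIVED_CONFIG= "$PMPROBE" "$@"
}
```

Each probe guard would then call probe_no_derived with the same arguments it 
passes to pmprobe today. Scoping the assignment to the single command keeps 
the change local to the probes.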

> ...
