
To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>, "'pcp developers'" <pcp@xxxxxxxxxxx>
Subject: Re: [pcp] Derived Metrics with rate()
From: Marko Myllynen <myllynen@xxxxxxxxxx>
Date: Mon, 29 Jun 2015 08:54:03 +0300
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <00e801d0b124$8b825d60$a2871820$@internode.on.net>
Organization: Red Hat
References: <558BAC86.2090005@xxxxxxxxxx> <00e801d0b124$8b825d60$a2871820$@internode.on.net>
Reply-to: myllynen@xxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

Hi,

On 2015-06-28 00:59, Ken McDonell wrote:
> 
> This is simpler for disk.dev.avrqsz:
>   disk.dev.avrqsz = 2 * rate(disk.dev.total_bytes) / rate(disk.dev.total)

Oh yeah, I mostly followed the formulas used in pcp-iostat.py, so indeed
the version that doesn't use the .total metrics looks convoluted compared
to that.

> (we already have metrics that return read+write bytes and read+write IOPs
> and putting the 2 * at the front is a little clearer IMHO to do the Kbytes
> -> blocks conversion, although I would argue that sar has got this wrong
> forever and the answer should be in bytes (or Kbytes) not "blocks")

Hmm, perhaps we should not repeat that mistake and instead use (K)bytes
(or even provide both).
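
For example, providing both could look something like this (an untested
sketch, the _kb name is just a placeholder):

  # average request size in Kbytes per I/O
  disk.dev.avrqsz_kb = rate(disk.dev.total_bytes) / rate(disk.dev.total)
  # average request size in 512-byte blocks, matching sar's avgrq-sz
  disk.dev.avrqsz = 2 * rate(disk.dev.total_bytes) / rate(disk.dev.total)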

> I don't think avqsz is correct.  read_rawactive and write_rawactive measure
> time during which  disk requests are being serviced  ... these are in units
> of milliseconds, so you need to divide by 1000 to get close to the sar
> numbers.

I couldn't figure out how to match the sar numbers exactly, but this was
at least reacting the same way as sar's avgqu-sz (I should have mentioned
this in my earlier email).
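
Based on your description, a corrected sketch might be something like the
following (untested, and the metric name is just a placeholder):

  # rawactive counters are in milliseconds, so rate() gives ms/s and
  # dividing by 1000 yields the time-averaged queue length
  disk.dev.avgqusz = (rate(disk.dev.read_rawactive) + rate(disk.dev.write_rawactive)) / 1000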

> BUT because there are so many ways to calculate the "average queue length"
> (this one is the time average, not the stochastic average) and most punters
> don't understand the differences I have always argued that this metric
> hinders, does not help, performance analysis.  See
> man/html/howto.diskperf.html (in the PCP source tree) for a much longer rant
> on this subject.

Thanks for the pointer, interesting rant :)

> The "library" of derived metrics is a work in progress ... I'm still not
> sure of the right way to do this.  For most users (maybe) there does not
> seem to be a case to warrant loading a bunch of derived metrics every time a
> PCP client is started, so I'm not sure a "library" that is always processed
> is the correct approach.  There is also an issue of potential name clashes
> between the derived metrics and the evolving PMDA metrics and indeed between
> the derived metrics themselves.

Agreed, off-by-default is the best approach.

Btw, is "library" a fitting term here, or would something like
"collection", "set", or even "repertory" be a better one? Some people
might initially associate "library" with a DSO, which could cause some
confusion. (I'll certainly leave this up to you native speakers to
figure out.)

> My current thoughts are to extend $PCP_DERIVED_CONFIG to be a $PATH-style
> list, and then if a directory appears in the list, all the files in that
> directory will be processed as though they are derived metric specification
> files.  This is (a) backwards compatible, (b) optional, and (c) gives a
> short-hand way of naming a bunch of derived metric files.

Sounds very good.
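
So, assuming the extension works as you describe, something like this
(hypothetical paths) would load one plain spec file plus every file in a
directory:

  # one spec file plus a directory of spec files, $PATH-style
  export PCP_DERIVED_CONFIG=$HOME/derived.conf:/etc/pcp/derived.d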

Thanks,

-- 
Marko Myllynen
