Re: [pcp] Calculated/derived metrics?

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>, Marko Myllynen <myllynen@xxxxxxxxxx>
Subject: Re: [pcp] Calculated/derived metrics?
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Thu, 14 May 2015 07:01:54 +1000
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <20150508181608.GA3195@xxxxxxxxxx>
References: <5534EBA8.4030509@xxxxxxxxxx> <1644393599.3651017.1429563442835.JavaMail.zimbra@xxxxxxxxxx> <55364606.1000503@xxxxxxxxxx> <55472B40.7050800@xxxxxxxxxx> <5547DE11.5050800@xxxxxxxxxxxxxxxx> <5549E4CD.5000408@xxxxxxxxxx> <554AFE4E.80000@xxxxxxxxxxxxxxxx> <y0mwq0kf7mm.fsf@xxxxxxxx> <554BEB16.7030208@xxxxxxxxxxxxxxxx> <554CC4ED.4090209@xxxxxxxxxx> <20150508181608.GA3195@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
Just reviewing this ... thanks for the suggestions Frank.

On 09/05/15 04:16, Frank Ch. Eigler wrote:
...
> So you still want rate, but smoothed or over a longer time period.

As I understand Marko's requirement, he'd like it smoothed over the life of the process, which is a special sort of smoothing ... now on to more general smoothing ...

> ... but a longer
> term and more robust (overflow-aware etc.) way would be to have a
> rate(metric,'10 minutes') or exponential_smooth(rate(metric),0.9)
> types of things.

I prefer exponential_smooth() to 'rate over some time' because it requires less storage, and because 'rate over time' really needs a rolling bucket implementation with (at a guess) sampling at x10 the "time" to avoid horrible discontinuities and get close to the expected semantics of the rate over the last (rather than some arbitrarily aligned) time period.
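To make the storage argument concrete, here is a rough Python sketch of the rolling bucket approach. RollingRate, its methods and the buckets parameter are made-up names for illustration, not anything in libpcp, and counter wrap handling is left out. It assumes something samples the counter at about window/10 and keeps every sample that falls inside the window.

```python
from collections import deque


class RollingRate:
    """Rate of a counter over the trailing `window` seconds.

    Hypothetical helper for illustration only; not part of libpcp.
    Counter wrap/overflow is deliberately not handled here.
    """

    def __init__(self, window, buckets=10):
        self.window = float(window)
        # Suggested sampling interval: the "x10" guess from the text above.
        self.sample_interval = self.window / buckets
        self.samples = deque()  # (timestamp, counter_value) pairs

    def add(self, timestamp, value):
        """Record one observation of the counter."""
        self.samples.append((timestamp, value))
        # Discard old samples, but keep the newest one at or before the
        # window edge so the retained span still covers the whole window.
        cutoff = timestamp - self.window
        while len(self.samples) >= 2 and self.samples[1][0] <= cutoff:
            self.samples.popleft()

    def rate(self):
        """Counter increase per second across the retained samples."""
        if len(self.samples) < 2:
            return None
        t0, v0 = self.samples[0]
        t1, v1 = self.samples[-1]
        return (v1 - v0) / (t1 - t0)
```

With a 10 minute window and samples every 60 seconds this holds about 11 (timestamp, value) pairs per metric instance, and the reported rate moves by at most one bucket's worth per sample rather than jumping at arbitrary window boundaries.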

For a fixed time base, both of these averaging methods require some higher-frequency fetching behind the scenes, and there is no such thread of control in libpcp.

The current derived metrics implementation sits on the data path, so it is restricted to whatever extra fetching you can add in front of the pmFetch() and whatever calculation you can do after the pmFetch() with a small amount of retained state (the last fetch data) for functions like rate().

We could consider exponential_smooth() to be the smoothed average over the values previously fetched by a client (as opposed to the more common semantics of average over samples collected at some time constant, e.g. sub-second sampling to produce the kernel's load average) ... the "time series" that is being averaged here would be the series of observations at the pmFetch() interface, not the series of observations at some fixed (independent and shorter) time interval ... that could work.
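A minimal Python sketch of those semantics follows. FetchSmoother and update() are hypothetical names, and two things are assumptions on my part: that the second argument (0.9 in Frank's example) is the weight given to the previous smoothed value, and that the first computed rate seeds the average. Each call to update() stands in for one client pmFetch() of a counter.

```python
class FetchSmoother:
    """Exponentially smoothed rate over the values a client has fetched.

    Hypothetical illustration only; not part of libpcp. `alpha` is assumed
    to be the weight kept on the previous smoothed value.
    """

    def __init__(self, alpha):
        self.alpha = alpha
        self.last = None      # (timestamp, counter_value) from the previous fetch
        self.smoothed = None  # current smoothed rate, None until two fetches

    def update(self, timestamp, value):
        """Feed one fetched counter value; return the smoothed rate so far."""
        if self.last is not None:
            t0, v0 = self.last
            rate = (value - v0) / (timestamp - t0)
            if self.smoothed is None:
                # Assumption: seed the average with the first observed rate.
                self.smoothed = rate
            else:
                self.smoothed = (self.alpha * self.smoothed
                                 + (1 - self.alpha) * rate)
        self.last = (timestamp, value)
        return self.smoothed
```

The only state carried between fetches is the previous sample and the previous smoothed value, the same order of storage as rate() needs today. Note that a fixed weight applied per fetch means the effective time constant depends on how often the client fetches, which is exactly the difference from sampling at a fixed, independent interval.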

> (Probably this and the pmie expression syntaxes should be unified.)

I'm not sure that is going to happen ... pmie's base is firmly rooted in first order predicate calculus, and although the two have some similarities in simple expression syntax, there are lots of differences in semantics and in the size of their respective grammars.

Probably better to let pmie chew on expressions written in terms of derived metrics ... this is known to work and be very robust.
