
Re: [pcp] Floating point problem

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: [pcp] Floating point problem
From: Martin Spier <mspier@xxxxxxxxxxx>
Date: Wed, 30 Jul 2014 16:51:47 -0700
Cc: Brendan Gregg <bgregg@xxxxxxxxxxx>, pcp@xxxxxxxxxxx, Amer Ather <aather@xxxxxxxxxxx>, Coburn Watson <cwatson@xxxxxxxxxxx>
My understanding was that the original metrics, kernel.all.cpu.user and kernel.all.cpu.sys, suffered from the same problem. If that's not the case, I completely agree with option 2.


On Wed, Jul 30, 2014 at 3:59 PM, Ken McDonell <kenj@xxxxxxxxxxxxxxxx> wrote:
On 31/07/14 08:00, Brendan Gregg wrote:
..

I actually like 2. It's simple.

I'd like to fetch not just per-second metrics but possibly other
intervals at the same time, including per-hour and per-day. And possibly
from multiple clients, and possibly ad-hoc queries. With 2, I simply
stash away the cumulative value and timestamp pairs, and use them
later when needed.
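
For illustration, a minimal sketch of this "stash now, rate later" idea, assuming each fetch yields a (timestamp, cumulative value) pair; the names here are illustrative and not part of any PCP API:

    # Hypothetical sketch: stash raw (timestamp, cumulative value) pairs
    # from each fetch, then derive rates over any interval on demand.
    def rate(sample_a, sample_b):
        """Average per-second rate between two (timestamp, value) pairs."""
        (t0, v0), (t1, v1) = sample_a, sample_b
        if t1 <= t0:
            raise ValueError("samples must be in increasing timestamp order")
        return (v1 - v0) / (t1 - t0)

    # The same stash serves every interval: per-second rates come from
    # adjacent samples, per-hour rates from samples an hour apart.
    samples = [(0.0, 1000.0), (1.0, 1250.0), (3600.0, 901000.0)]
    per_second = rate(samples[0], samples[1])    # 250.0 units/sec
    hourly_avg = rate(samples[0], samples[2])    # 250.0 units/sec, averaged over an hour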

Excellent.

Option 2 is more consistent with the PCP philosophy, where rate conversion (which is but one form of aggregation) is preferably the domain of the clients consuming the performance data rather than the producers of the data.

You'll also need to stash the timestamps from the fetches, so the rate conversion (or other temporal averaging) is based on consistent timestamps from the source of the data.

If you start to do serious aggregation, you'll also need the metadata to know whether a metric is a counter and to allow correct scaling, e.g. to get the units of time for the CPU metrics, so you can scale as required before dividing by the delta in the timestamps.
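
For illustration, a minimal sketch of that scaling step, assuming the counters are (as with kernel.all.cpu.user and kernel.all.cpu.sys) cumulative milliseconds of CPU time; the function and parameter names are hypothetical, not from any PCP API:

    # Hypothetical sketch: scale a millisecond CPU counter to seconds
    # before dividing by the timestamp delta, yielding a utilization.
    MS_PER_SEC = 1000.0

    def cpu_utilization(v0_ms, t0, v1_ms, t1, ncpu=1):
        """Fraction of available CPU time consumed between two fetches."""
        if t1 <= t0:
            raise ValueError("timestamps must increase")
        busy_seconds = (v1_ms - v0_ms) / MS_PER_SEC  # scale millisec -> sec first
        return busy_seconds / ((t1 - t0) * ncpu)     # dimensionless, 0.0 .. 1.0

    # e.g. 2000 ms of user CPU over a 1-second interval on a 4-way machine:
    # cpu_utilization(0, 100.0, 2000, 101.0, ncpu=4) -> 0.5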

This has all been done before (as Frank points out!), so lemme know if you need any assistance.

