Re: [pcp] python pmExtractValue segfault

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] python pmExtractValue segfault
From: Michele Baldessari <michele@xxxxxxxxxx>
Date: Thu, 29 May 2014 22:14:19 +0100
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <827869715.16856930.1401326163702.JavaMail.zimbra@xxxxxxxxxx>
References: <20140527223044.GC4384@xxxxxxxxxxxxxxx> <1135934547.16110444.1401239312938.JavaMail.zimbra@xxxxxxxxxx> <20140528144433.GD4384@xxxxxxxxxxxxxxx> <827869715.16856930.1401326163702.JavaMail.zimbra@xxxxxxxxxx>
User-agent: Mutt/1.5.21 (2012-12-30)

Hi Nathan,

On Wed, May 28, 2014 at 09:16:03PM -0400, Nathan Scott wrote:
> > Since pmNameInDomArchive was quite high up in my profiling
> > I cached it in a dictionary so that indom_cache[(i, j)] = indom_name.
> 
> Nice.  If we look to begin merging these APIs, pcp.pmcc has a MetricCache
> class, with by-name, by-PMID, and by-indom dictionaries, with the latter
> providing the same optimisation you're describing here.
> 
> > This way I only look it up when the metric appears the first time, and
> > I shave off 40% of the time needed to parse this (the rest is dominated
> > by python casts and by pmExtractValue calls, for which there are less
> > obvious ways to improve). Is this a safe thing to do? Am I guaranteed
> > that the mapping (i, j)->indom_name will stay the same in an archive?
> > 
> > Somehow I assume that is not the case (pmcd restart with new PMDA, etc.),
> > but maybe I'll get lucky ;)
> 
> Heh, yes and no.  PMDAs are required to make all efforts possible to
> ensure that the instance-ID-to-name mapping is stable.  There's even
> a series of APIs available - pmdaCache(3) - to allow them to persist
> across PMDA restarts (even across reboots).
> 
> However, there are always corner cases.  One example is per-process
> metrics - these use PIDs as the instance ID, which of course can be
> reused by the kernel and spring up afresh with a different process
> behind them at some later time.
> 
> So, in general this is a good optimisation - but there are some cases
> (exceptional cases, definitely not the norm) where PMDAs are unable to
> come to the party.  There are a number of tools that do insist on the
> stable mapping (or assume it), so it's not a terrible thing on the part
> of your tool if you choose to keep this optimisation.

Ah, I see. Thanks for taking the time to explain this to me. It makes
sense. I think I'll take correctness at all costs over speed in this case.
Right now this is where I am at with the output:
http://acksyn.org/software/pcp2pdf/output.pdf
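
For the archives, here is a minimal sketch of the (indom, inst) -> name
cache discussed above, with an opt-out for instance domains whose IDs can
be reused (per-process metrics being the example given). All names here
(InstanceNameCache, lookup, volatile_indoms) are hypothetical and chosen
for illustration only. The lookup callable is assumed to wrap something
like pmNameInDomArchive. This is not the pcp.pmcc MetricCache class, and
which indoms count as volatile is left to the caller.

```python
class InstanceNameCache:
    """Memoise (indom, inst) -> instance name lookups.

    `lookup` is any callable taking (indom, inst) and returning a name;
    in a real tool it is assumed to wrap an archive lookup such as
    pmNameInDomArchive. Indoms listed in `volatile_indoms` are never
    cached, because their instance IDs (for example PIDs) may be reused
    for a different instance later in the archive.
    """

    def __init__(self, lookup, volatile_indoms=()):
        self._lookup = lookup
        self._volatile = frozenset(volatile_indoms)
        self._names = {}

    def name(self, indom, inst):
        # Correctness first: always ask again for unstable mappings.
        if indom in self._volatile:
            return self._lookup(indom, inst)
        key = (indom, inst)
        if key not in self._names:
            # First time this instance is seen: do the expensive lookup once.
            self._names[key] = self._lookup(indom, inst)
        return self._names[key]
```

With an empty volatile_indoms this behaves like the plain dictionary
cache; listing every indom in it is equivalent to dropping the cache
altogether, which is the correctness-over-speed choice.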

More or less it covers what I had in mind. I'll now work on a pcp branch
and make sure I use or extend the pmcc class as needed, move to
pmOptions and add the appropriate qa tests. I'll ping you to take a look
once it's ready, if that's okay.

Thanks and regards,
Michele
-- 
Michele Baldessari            <michele@xxxxxxxxxx>
C2A5 9DA3 9961 4FFB E01B  D0BC DDD4 DCCB 7515 5C6D
