pcp
[Top] [All Lists]

Re: Metrics With More That One PMID

To: Dave Brolley <dave.brolley@xxxxxxxxxx>
Subject: Re: Metrics With More That One PMID
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Thu, 25 Feb 2016 08:21:55 +1100
Cc: PCP Mailing List <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <56CDE8D4.5000308@xxxxxxxxxx>
References: <56CDE8D4.5000308@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
G'day Dave.

On 25/02/16 04:31, Dave Brolley wrote:
> Hi Ken,
> 
> Waaaaaay, back when we first started talking about this, you advised me 
> to check the multi-archive meta data for
> 
>   * metrics with more than one PMID
>   * PMIDs with more than one metric name.
> 
> While looking into this, I realized that the latter case is explicitly 
> supported in the meta data file format. That is, we support PMIDs with 
> more than one name. Was there some other issue that you had in mind with 
> regard to this?

The latter case is OK.  My advise probably predates the changes I made to allow 
duplicate names in the PMNS by default ... so ignore the latter one as the 
infrastructure and tools will all deal with this seamlessly these days.

> The former case, is flagged as an error by AddPMNSNode(), but the error 
> is currently tolerated  by __pmLogLoadMeta() -- see line 250 of AddPMNSNode
> logmeta.c. The comment there is incorrect. PM_ERR_PMID is returned when 
> the same name has a different PMID. Is there some reason why this would 
> a problem in particular for a multi-archive context?

My comments related to the way the code was when I first wrote it.  I think the 
comment is correct ... if AddPMNSNode() returns PM_ERR_PMID __pmLogLoadMeta() 
charges on ... I am not sure this is the correct behaviour (and now we have 
pmlogrewrite, which also came after this code was last changed, I'd argue the 
original behaviour should probably be reinstated.

Consider foo -> pmid x and later in the metadata we find another entry for foo, 
but this time for pmid y.  Now there is _no_ way the PMNS data structures can 
support two different "foo" names mapping to different pmids (blah -> pmid x is 
fine and this is the "latter case" above).  The options for foo are:

1. abort (the original behaviour) ... recovery requires the user to choose 
which of pmid x or pmid y is correct, and then use pmlogrewrite to map x -> y 
(or vice versa) for all the archives that have the wrong pmid for foo

2. charge on (the current behaviour) and assume the first pmid you saw (x) is 
really correct ... this means none of the pmResults containing the pmid y 
version of foo will be visible if you ask for the foo metric (as this will only 
search for pmid x), and searches for foo may return values for some other 
metric (that really has pmid x) [see qa/507 comments below]

3. la la land ... build a map, assume x is correct and map y to x on the fly 
... pmDesc and pmResults all need to be scanned and potentially rewritten ... 
this is pmlogrewrite embedded in libpcp and I don't think this is ever going to 
be the right thing to do.

> FWIW, there is a a test case (507) for which the metric 
> pmcd.pmie.configfile has a different PMID in the given name space file 
> (src/root_irix) than it does in the given archive (archives/pcpcmd). 
> This test started failing when I added the check for metrics with more 
> than one PMID.

I think this demonstrates why the original behaviour was more correct than the 
current behaviour.  Consider this ...

$ pminfo -mf -a archives/pcpcmd pmcd.pmie.configfile
pmcd.pmie.configfile PMID: 2.5.7
    inst [2824710 or "2824710"] value "/var/pcp/pmdas/summary/expr.pmie"

$ pminfo -n src/root_irix -mf -a archives/pcpcmd pmcd.pmie.configfile
pmcd.pmie.configfile PMID: 2.5.0
inst [2824710 or "2824710"] value 0

In the first case we get the correct PMID for pmcd.pmie.configfile and the 
correct value.

In the second case we get the wrong PMID (2.5.0) from the explicit PMNS, then 
see a different PMID (2.5.7) from the archive and ignore this rather than error 
and then fetch a value for the WRONG metric based on the WRONG PMID.

It is a long shot, but Nathan may be the only who can explain why 
__pmLogLoadMeta() was changed, er, five and a half years ago.

<Prev in Thread] Current Thread [Next in Thread>