
Re: Multi-Volume Archive + Live Data Playback for PCP Client Tools

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: Multi-Volume Archive + Live Data Playback for PCP Client Tools
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Tue, 04 Nov 2014 08:48:29 +1100
Cc: 'Dave Brolley' <brolley@xxxxxxxxxx>, 'PCP Mailing List' <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <20141103142643.GA3859@xxxxxxxxxx>
References: <542C21AE.1010504@xxxxxxxxxx> <007e01cfe010$7867f090$6937d1b0$@internode.on.net> <545110DC.2020104@xxxxxxxxxx> <y0mtx2hmwml.fsf@xxxxxxxx> <001d01cff6d7$0e9e4a50$2bdadef0$@internode.on.net> <20141103142643.GA3859@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
On 04/11/14 01:26, Frank Ch. Eigler wrote:
...
> One glitch with that could be PMDAs whose PMNS is dynamic from run to
> run (like the papi pmda, which is almost able to be used as a logged
> data source).  The name-to-PMID mapping may vary there (as new PAPI
> versions come, or host CPU changes, new counters(=metrics) may appear
> in some odd sequence, so get scrambled PMIDs), but the name & pmDesc
> (semantics etc.) would remain the same and previous values comparable.
>
> I don't know how pervasive this situation would be, but if it's not
> too hard to support, we should.  (e.g., we could track metrics across
> archives by name rather than pmid.)

This is not pervasive at all ... this is the only PMDA I am aware of that behaves like this.

And I think I would be lobbying for the PMDA to be different, not the archive handling to be different.

There are already services available to a PMDA that allow the instance domain mapping (name <--> id) to be consistent across PMDA restarts. It would only be a small perversion of the pmdaCache* routine usage to allow the papi pmda to maintain a consistent and persistent mapping from metric name to the low-order bits of the PMID.

Note this does not need to be consistent between hosts, just consistent for a single host across repeated PMDA invocations (and possible changes in version and configuration of the PMDA).
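
To make the idea concrete, here is a rough sketch in plain Python. It is not the pmdaCache API and not papi pmda code; a JSON file stands in for the persistent cache, and the class name, file name and counter names are all made up for illustration. The point is only that a name keeps the item number it was first given, whatever order the counters are discovered in on later runs.

```python
import json
import os
import tempfile


class PersistentItemMap:
    """Hypothetical name -> item-number table that survives PMDA restarts."""

    def __init__(self, path):
        self.path = path
        self.items = {}
        if os.path.exists(path):
            with open(path) as f:
                self.items = json.load(f)

    def item_for(self, name):
        # Reuse the number handed out on an earlier run, if there was one.
        if name not in self.items:
            # Never recycle numbers: take the highest seen so far plus one.
            self.items[name] = max(self.items.values(), default=-1) + 1
            self._save()
        return self.items[name]

    def _save(self):
        # Write to a temporary file, then rename over the old table.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.items, f)
        os.replace(tmp, self.path)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "names.cache")
        run1 = PersistentItemMap(path)
        first = [run1.item_for(n) for n in ("a", "b")]
        # Second invocation: a new counter "c" shows up first, order differs.
        run2 = PersistentItemMap(path)
        second = {n: run2.item_for(n) for n in ("c", "b", "a")}
        return first, second
```

In demo() the second run sees the counters in a different order and with a newcomer, but "a" and "b" keep their earlier numbers and only "c" gets a fresh one.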

Returning to the multi-archive work that triggered all of this: where the name <--> PMID mapping is contradictory between archives, we have several cases and options:

Case 1

Multiple names map to the same PMID.

Options:

1a. Assume this is the same metric and the PMID is correct, add both names to the PMNS (the data structure and libpcp routines support this, so for example, given the PMID pmNameAll() will return all the corresponding names). No rewriting of pmDesc or pmResult data structures is required.

1b. Assume these are different metrics. Invent a new (and unique) PMID for the subsequent ones, add all the names and their unique PMIDs to the PMNS. Synthesize a new pmDesc for each metric that gets an invented PMID. Rewrite pmResult data structures to map from the old PMID to an invented PMID, but in the context of the archive in which the name was mapped (need to choose the PMID based on which archive the pmResult came from).
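
A toy model of the two Case 1 options, in Python rather than anything resembling libpcp internals. Each archive is reduced to a name -> PMID dict, PMIDs are plain strings, and the "invented.N" PMIDs, the function name and the metric names are hypothetical. It assumes each name has only one PMID across the set (i.e. no Case 2 in the mix) and that within a single archive each PMID has one name.

```python
from itertools import count


def merge_case1(archives, same_metric):
    """Merge name -> PMID dicts when several names share one PMID.

    same_metric=True is option 1a, False is option 1b.
    Returns (pmns, rewrite): pmns is the merged name -> PMID table and
    rewrite maps (archive index, original PMID) -> invented PMID.
    """
    pmns = {}
    owner = {}      # PMID -> first name seen using it
    assigned = {}   # (name, original PMID) -> PMID used in the merged view
    rewrite = {}
    fresh = count()
    for idx, mapping in enumerate(archives):
        for name, pmid in mapping.items():
            if (name, pmid) not in assigned:
                if owner.setdefault(pmid, name) == name or same_metric:
                    # First owner of the PMID, or a 1a alias for it.
                    assigned[(name, pmid)] = pmid
                else:
                    # 1b: a different metric, so it gets its own PMID.
                    assigned[(name, pmid)] = f"invented.{next(fresh)}"
                pmns[name] = assigned[(name, pmid)]
            if assigned[(name, pmid)] != pmid:
                # pmResults from this archive need the PMID swapped.
                rewrite[(idx, pmid)] = assigned[(name, pmid)]
    return pmns, rewrite


archives = [{"sample.x": "60.0.1"}, {"sample.y": "60.0.1"}]
option_1a = merge_case1(archives, same_metric=True)
option_1b = merge_case1(archives, same_metric=False)
```

With 1a the rewrite table stays empty and both names end up on the one PMID. With 1b the rewrite table is keyed by archive index, which is the "choose the PMID based on which archive the pmResult came from" part.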

Case 2

Multiple PMIDs map to the same name.

Options:

2a. Assume these are the same metric. Pick one PMID (probably the first one encountered ... Dave, this is the "first one is the winner" part I was cryptically referring to in the earlier mail) and add mappings from all the names to the same PMID into the PMNS. There is only one pmDesc as a result, and in each pmResult any of the loser PMIDs needs to be mapped to the winner PMID.

2b. Assume these are different metrics. Invent a new (and unique) name for the subsequent ones, add all the names and their associated PMIDs to the PMNS. No rewriting of pmDesc or pmResult data structures is required.
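
The Case 2 options in the same toy style (Python, archives reduced to name -> PMID dicts, PMIDs as strings). The "_altN" naming scheme, the function name and the metric names are invented for the example, and it assumes an invented name never collides with a real one.

```python
def merge_case2(archives, same_metric):
    """Merge name -> PMID dicts when one name has several PMIDs.

    same_metric=True is option 2a, False is option 2b.
    Returns (pmns, rewrite): pmns is the merged name -> PMID table and
    rewrite maps (archive index, loser PMID) -> winner PMID (2a only).
    """
    pmns = {}
    rewrite = {}
    renamed = {}    # (name, PMID) -> invented name, used by 2b
    for idx, mapping in enumerate(archives):
        for name, pmid in mapping.items():
            winner = pmns.setdefault(name, pmid)   # first PMID seen wins
            if winner == pmid:
                continue
            if same_metric:
                # 2a: one pmDesc, loser PMIDs in pmResults become the winner.
                rewrite[(idx, pmid)] = winner
            else:
                # 2b: keep the PMID, give the later metric a new name.
                new_name = renamed.setdefault(
                    (name, pmid), f"{name}_alt{len(renamed) + 1}")
                pmns[new_name] = pmid
    return pmns, rewrite


archives = [{"sample.total": "60.0.1"}, {"sample.total": "60.0.99"}]
option_2a = merge_case2(archives, same_metric=True)
option_2b = merge_case2(archives, same_metric=False)
```

Here 2a is the option that produces a rewrite table, while 2b only grows the PMNS, the mirror image of Case 1.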

Case 3

N:M mapping between PMIDs and names.

Forget it, don't even think about this one.

I was suggesting that initially none of this is supported, i.e. we require the name <--> PMID mapping to be consistent across all archives in the set (Stage 0).
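
A Stage 0 check could be as simple as refusing any archive set where the combined mapping is not 1:1. The Python sketch below is only an illustration of that check, with archives again reduced to name -> PMID dicts; the function name and the report format are made up. It also labels each inconsistency with the case it falls under.

```python
from collections import defaultdict


def check_archive_set(archives):
    """Return a list of (case, key, conflicting values); empty means consistent."""
    names_of = defaultdict(set)   # PMID -> names seen for it
    pmids_of = defaultdict(set)   # name -> PMIDs seen for it
    for mapping in archives:
        for name, pmid in mapping.items():
            names_of[pmid].add(name)
            pmids_of[name].add(pmid)

    problems = []
    for pmid, names in sorted(names_of.items()):
        if len(names) > 1:
            # If any of those names also has several PMIDs, it is N:M.
            n_to_m = any(len(pmids_of[n]) > 1 for n in names)
            problems.append(
                ("case 3" if n_to_m else "case 1", pmid, sorted(names)))
    for name, pmids in sorted(pmids_of.items()):
        # Pure Case 2 only; N:M tangles were already reported above.
        if len(pmids) > 1 and all(len(names_of[p]) == 1 for p in pmids):
            problems.append(("case 2", name, sorted(pmids)))
    return problems


ok = check_archive_set([{"a": "1.0.1"}, {"a": "1.0.1", "b": "1.0.2"}])
case1 = check_archive_set([{"a": "1.0.1"}, {"b": "1.0.1"}])
case2 = check_archive_set([{"a": "1.0.1"}, {"a": "1.0.2"}])
case3 = check_archive_set([{"a": "1.0.1", "b": "1.0.2"}, {"a": "1.0.2"}])
```

An empty list means the set passes Stage 0; anything else would be rejected at that stage.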

Then options 1a and 2b might be suitable for a Stage 1.

And Stage 2 might add support for all 4 options above (Frank's suggestion is 1b I think).

Note that all of the options above can be supported OUTSIDE libpcp today using an (admittedly manual) pmlogrewrite step on one or more archives to remove the inconsistencies before the archive set is processed as a unit.
