
Re: [pcp] Multi-Volume Archive + Live Data Playback for PCP Client Tools

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>, "'PCP Mailing List'" <pcp@xxxxxxxxxxx>
Subject: Re: [pcp] Multi-Volume Archive + Live Data Playback for PCP Client Tools
From: Dave Brolley <brolley@xxxxxxxxxx>
Date: Wed, 29 Oct 2014 12:07:56 -0400
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <007e01cfe010$7867f090$6937d1b0$@internode.on.net>
References: <542C21AE.1010504@xxxxxxxxxx> <007e01cfe010$7867f090$6937d1b0$@internode.on.net>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
I re-read this thread, extracting the ideas that I need to focus on for the initial task of multi-archive support. When I got to the end, I found that Ken had already pretty much summed it up ...

On 10/04/2014 04:19 PM, Ken McDonell wrote:
As others have pointed out ... each -a gets mapped to a context, so we need
some sort of syntax that can name more than one archive in a single command
line argument to be used with -a ... so this leads to the following options:
- dirname
- glob-like, probably not just * but the whole shooting match of ?, [...]
and {...,...}
- a list, e.g. -a 20141001,20140930
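As a rough illustration of how one -a argument could name several archives, here is a minimal Python sketch covering the directory, glob and list forms. The function name expand_archive_spec is hypothetical, and the idea that each archive can be recognised by a ".meta" file is an assumption made for the sketch, not a statement about how libpcp would do it. Brace patterns ({...,...}) are not handled, since Python's glob module does not expand them and they would also clash with the comma-separated list form.

```python
import glob
import os

# Assumption for this sketch: each archive has one "<base>.meta" file.
META_SUFFIX = ".meta"


def expand_archive_spec(spec):
    """Expand one -a argument into an ordered list of archive base names.

    Hypothetical helper: accepts a directory, a glob pattern (*, ?, [...])
    or a comma-separated list of any of these.
    """
    archives = []
    for part in spec.split(","):
        if os.path.isdir(part):
            # Directory form: take every archive found inside it.
            pattern = os.path.join(part, "*" + META_SUFFIX)
        elif any(ch in part for ch in "*?["):
            # Glob form: match the pattern against archive base names.
            pattern = part + META_SUFFIX
        else:
            # Plain name: pass it through unchanged.
            pattern = None

        if pattern is None:
            found = [part]
        else:
            found = sorted(p[:-len(META_SUFFIX)] for p in glob.glob(pattern))

        # Keep the first occurrence of each archive, preserving order.
        for name in found:
            if name not in archives:
                archives.append(name)
    return archives
```

With this sketch, the list example above, "20141001,20140930", simply comes back as the two names in the order given.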

Now this could be handed off to pmNewContext, and the client could use a
single PMAPI context as a handle to access this _set_ of archives

For this to work, we need some restrictions on the set of archives that can
be combined in this way:
- all for the same host
- non-overlapping time windows

If these are not satisfied, pmNewContext needs to return a (new) error code.

Then we need to consider the metadata:
- timezone could change - this will require some further investigation
before a cunning plan can be proposed
- PMNS - merge 'em all while there are no conflicts ... in the case of a
conflict (different names map to the same PMID or the same name is assigned
more than one PMID) we probably need dynamic remapping ("first one found
wins" is probably the right strategy)
- metric descriptors - if these change it gets very messy, although this is rare
in practice
- instance domains - should be close to OK, as these are already expected to
vary over time ... it would be bad if the semantics of the instance domain
members changed between archives, but this is more of a PMDA botch issue
than a problem for libpcp to solve

One simple solution that might be acceptable for 95% of the cases would be
to rule all of the metadata differences (except instance domains) to be
unsupported.  So pmNewContext would fail.  The user's option for resolving
this is to use pmlogrewrite to amend one or more of the archives and remove
the differences.  I think this is definitely an OK plan.

I like Ken's thinking of this as a set of archives, and I think that the restrictions he has suggested (non-overlapping time windows, all from the same host) are practical to begin with (perhaps it could someday be possible to deal with overlapping time windows).
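To make the two restrictions concrete, here is a small Python sketch of the check that would have to pass before a set of archives is accepted. Everything in it is hypothetical: ArchiveInfo, ArchiveSetError and check_archive_set are names invented for the sketch, the exception stands in for the new error code, and treating two windows that merely touch (one ends exactly when the next starts) as non-overlapping is an assumption.

```python
from collections import namedtuple

# Hypothetical summary of one archive: name, source host, first and last
# timestamps (seconds since the epoch, with end >= start).
ArchiveInfo = namedtuple("ArchiveInfo", "name host start end")


class ArchiveSetError(Exception):
    """Stand-in for the new error code mentioned in the thread."""


def check_archive_set(archives):
    """Return the archives ordered by start time, or raise ArchiveSetError."""
    if not archives:
        raise ArchiveSetError("empty archive set")

    # Restriction 1: all archives must come from the same host.
    hosts = sorted({a.host for a in archives})
    if len(hosts) > 1:
        raise ArchiveSetError("multiple hosts: " + ", ".join(hosts))

    # Restriction 2: time windows must not overlap. Once sorted by start
    # time, comparing each archive with its successor is sufficient.
    ordered = sorted(archives, key=lambda a: a.start)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            raise ArchiveSetError(
                "overlapping time windows: %s and %s" % (prev.name, cur.name))
    return ordered
```

A side effect of the check is that the caller gets the archives back in time order, which is the order in which a single context would presumably want to read them.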

The idea of disallowing metadata differences also seems like a good starting point, but I imagine that remapping (also mentioned) is possible as an enhancement. I'll ask Ken to elaborate on what he meant by "first one found wins" if/when we decide to do that (or any time before then that he has time to do so).
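For the strict starting point, the PMNS part of the check could look roughly like the Python sketch below. It merges the name-to-PMID mappings of several archives and fails on either kind of conflict Ken describes: the same name assigned more than one PMID, or different names mapped to the same PMID. The names merge_pmns and MetadataConflict are hypothetical, each namespace is modelled as a plain dict, and the metric names and PMID strings in the usage note are made up for illustration. No remapping is attempted.

```python
class MetadataConflict(Exception):
    """Stand-in for the error that would make pmNewContext fail."""


def merge_pmns(namespaces):
    """Merge name -> PMID mappings, raising MetadataConflict on any clash.

    `namespaces` is a sequence of dicts, one per archive (an assumption
    made for this sketch).
    """
    name_to_pmid = {}
    pmid_to_name = {}
    for pmns in namespaces:
        for name, pmid in pmns.items():
            # Conflict 1: the same name is assigned more than one PMID.
            if name_to_pmid.get(name, pmid) != pmid:
                raise MetadataConflict(
                    "%s maps to both %s and %s"
                    % (name, name_to_pmid[name], pmid))
            # Conflict 2: different names map to the same PMID.
            if pmid_to_name.get(pmid, name) != name:
                raise MetadataConflict(
                    "%s is named both %s and %s"
                    % (pmid, pmid_to_name[pmid], name))
            name_to_pmid[name] = pmid
            pmid_to_name[pmid] = name
    return name_to_pmid
```

For example, merging {"demo.a": "1.0.1"} with {"demo.b": "1.0.2"} yields a namespace with both metrics, while merging {"demo.a": "1.0.1"} with {"demo.a": "1.0.9"} raises MetadataConflict, which is the point at which the user would be expected to reach for pmlogrewrite.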

Thanks to everyone for getting me going in the right direction.

Dave
