I re-read this thread extracting the ideas that I need to focus on for
the initial task of multi-archive support. When I got to the end, I
found that Ken had already pretty much summed it up ...
On 10/04/2014 04:19 PM, Ken McDonell wrote:
As others have pointed out ... each -a gets mapped to a context, so we need
some sort of syntax that can name more than one archive in a single command
line argument to be used with -a ... so this leads to the following options:
- dirname
- glob-like, probably not just * but the whole shooting match of ?, [...]
and {...,...}
- a list, e.g. -a 20141001,20140930
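To make the three options concrete, here is a rough Python sketch of how a single -a argument might be expanded into a list of archive base names. It is only an illustration, not a proposal for the libpcp implementation: the function names are made up, and it assumes each archive can be identified by its .meta file and that commas outside {...} separate list items.

```python
import glob
import os
import re


def split_commas(spec):
    """Split on commas that are not inside a {...} group."""
    parts, depth, cur = [], 0, ""
    for ch in spec:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(cur)
            cur = ""
        else:
            cur += ch
    parts.append(cur)
    return [p for p in parts if p]


def expand_braces(pattern):
    """Expand {a,b,...} alternatives, innermost group first."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if m is None:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    out = []
    for alt in m.group(1).split(","):
        out.extend(expand_braces(head + alt + tail))
    return out


def expand_archive_spec(spec):
    """Hypothetical helper: map one -a argument to sorted archive base names."""
    bases = set()
    for item in split_commas(spec):
        for pattern in expand_braces(item):
            if os.path.isdir(pattern):
                # dirname case: take every archive found in the directory
                pattern = os.path.join(pattern, "*")
            # assumption: an archive is present if its .meta file exists
            for meta in glob.glob(pattern + ".meta"):
                bases.add(meta[: -len(".meta")])
    return sorted(bases)
```

With this, a directory name, 2014100[12], 201410{01,02} and 20141001,20140930 all go through the same code path and come out as a sorted list of base names.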
Now this could be handed off to pmNewContext, and the client could use a
single PMAPI context as a handle to access this _set_ of archives.
For this to work, we need some restrictions on the set of archives that can
be combined in this way:
- all for the same host
- non-overlapping time windows
If these are not satisfied, pmNewContext needs to return a (new) error code.
Then we need to consider the metadata:
- timezone could change - this will require some further investigation
before a cunning plan can be proposed
- PMNS - merge 'em all the while there are no conflicts ... in the case of a
conflict (different names map to the same PMID or the same name is assigned
more than one PMID) we probably need dynamic remapping ("first one found
wins" is probably the right strategy)
- metric descriptors - if these change it gets very messy, although this is
rare in practice
- instance domains - should be close to OK, as these are already expected to
vary over time ... it would be bad if the semantics of the instance domain
members changed between archives, but this is more of a PMDA botch issue
than a problem for libpcp to solve
One simple solution that might be acceptable for 95% of the cases would be
to rule all of the metadata differences (except instance domains) to be
unsupported. So pmNewContext would fail. The user's option for resolving
this is to use pmlogrewrite to amend one or more of the archives and remove
the differences. I think this is definitely an OK plan.
I like Ken's thinking of this as a set of archives, and I think that the
restrictions that he has suggested (non-overlapping time windows, all
from the same host) are practical to begin with (perhaps it could
someday be possible to deal with overlapping time windows).
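As a rough illustration of those two restrictions, the check might look something like the Python sketch below. Everything in it is hypothetical: ArchiveInfo, ArchiveSetError and check_archive_set are names I made up, the real check would live in libpcp and return a new PMAPI error code, and I have assumed that an archive starting exactly where the previous one ends does not count as overlapping.

```python
from dataclasses import dataclass


@dataclass
class ArchiveInfo:
    """Hypothetical summary of one archive (not a PCP structure)."""
    name: str
    hostname: str
    start: float  # seconds since the epoch
    end: float    # seconds since the epoch


class ArchiveSetError(ValueError):
    """Stand-in for the new error code pmNewContext would return."""


def check_archive_set(archives):
    """Return the archives in time order, or raise if the set is unusable."""
    if not archives:
        raise ArchiveSetError("empty archive set")
    ordered = sorted(archives, key=lambda a: a.start)
    host = ordered[0].hostname
    for a in ordered:
        if a.hostname != host:
            raise ArchiveSetError(
                f"{a.name}: host {a.hostname} differs from {host}")
    for prev, cur in zip(ordered, ordered[1:]):
        # assumption: touching end/start times are allowed
        if cur.start < prev.end:
            raise ArchiveSetError(
                f"{prev.name} and {cur.name} overlap in time")
    return ordered
```

Sorting by start time first means the overlap test only needs to compare neighbours.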
The idea of disallowing metadata differences also seems like a good
starting point, but I imagine that the idea of remapping (also
mentioned) is possible as an enhancement. I'll ask Ken to elaborate on
what he meant by "first one found wins" if/when we decide to do that (or
any time before then that he has time to do so).
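For the strict starting point (no remapping), a PMNS merge that fails on either kind of conflict Ken describes could be sketched as below. This is only my reading of it: merge_pmns is a made-up name, each archive's PMNS is modelled as a plain dict from metric name to an integer PMID, and the real thing would report an error code rather than raise an exception.

```python
def merge_pmns(namespaces):
    """Merge per-archive {name: pmid} maps, failing on any conflict.

    Hypothetical sketch: a conflict is the same name with more than one
    PMID, or different names mapping to the same PMID.
    """
    name_to_pmid = {}
    pmid_to_name = {}
    for index, pmns in enumerate(namespaces):
        for name, pmid in pmns.items():
            if name_to_pmid.get(name, pmid) != pmid:
                raise ValueError(
                    f"archive {index}: {name} has PMID {pmid}, "
                    f"previously {name_to_pmid[name]}")
            if pmid_to_name.get(pmid, name) != name:
                raise ValueError(
                    f"archive {index}: PMID {pmid} is named {name}, "
                    f"previously {pmid_to_name[pmid]}")
            name_to_pmid[name] = pmid
            pmid_to_name[pmid] = name
    return name_to_pmid
```

Archives that agree on every name they share merge cleanly, and a name that appears in only some of the archives is simply added to the combined PMNS.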
Thanks to everyone for getting me going in the right direction.
Dave