
RE: [pcp] Multi-Volume Archive + Live Data Playback for PCP Client Tools

To: "'Dave Brolley'" <brolley@xxxxxxxxxx>, "'PCP Mailing List'" <pcp@xxxxxxxxxxx>
Subject: RE: [pcp] Multi-Volume Archive + Live Data Playback for PCP Client Tools
From: "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx>
Date: Sun, 5 Oct 2014 07:19:06 +1100
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <542C21AE.1010504@xxxxxxxxxx>
References: <542C21AE.1010504@xxxxxxxxxx>
Thread-index: AQHclF1cpR2XxHkXIjv6In84hym83JwFf+EQ
G'day Dave.

Looking forward to the next chapter of Dave's Most Excellent Adventure!

Apologies for jumping in late, ... this is not my day job!

> -----Original Message-----
> From: pcp-bounces@xxxxxxxxxxx [mailto:pcp-bounces@xxxxxxxxxxx] On Behalf
> Of Dave Brolley
> Sent: Thursday, 2 October 2014 1:46 AM
> To: PCP Mailing List
> Subject: [pcp] Multi-Volume Archive + Live Data Playback for PCP Client
> Tools
> 
> ...
> The current situation as I understand it:
> 
> *     PCP archives are created by pmlogger in distinct volumes due to various
> constraints, such as a maximum file size of 2GB, the desire to allow organization
> of the collected data, the desire to be able to manage data retention (i.e. log
> rotation) and, undoubtedly, for other reasons as well.

There are 3 things going on here ...
- we have multiple archives because each archive accommodates metrics from a
single host (this is partly because archives and pmcd are interchangeable
sources of metrics, and partly because of the difficulty of mixing metadata
from different hosts within a single archive format)
- we have multiple archives for management reasons - log rotation, backup,
etc
- within a single archive, we _may_ have multiple volumes - the 32-bit
offsets in the index file are the main reason for this (although this limit
is rarely reached in practice because of the multiple archives issues above);
the other reasons were speculative at the time of the original design,
but never turned out to be important

> *     Some multi-volume support exists in the form of archive folios. These can
> be created by mkaf(1) but are also created by some other tools, such as
> pmchart(1) and pmview(1). Archive volumes in a folio may be examined using
> pmafm(1) using its 'replay', 'repeat' or 'run' commands. The latter two
> commands allow for repeated application of PCP client tools against one or more
> archives in the folio.

Folios have nothing really to do with multi-volume.  They are motivated by 2
different considerations:
1. For the pmlogger_check/pmlogger_daily managed archives, the name of the
archive is a timestamp which is variable and impossible to guess with any
certainty ... the Latest folio alongside these archives creates a constant
"symbolic" name that can be used to manipulate the current archive.  The
Latest folio is created by pmlogger_check and pmnewlog.
2. For the gui tools (pmchart and pmview), there is a File->Record action
that creates a standalone PCP archive for whatever metrics are currently
being displayed.  Now these tools support concurrent display of metrics from
multiple hosts, so the "recording" may generate multiple archives, and also
the configuration of the tool that launches the "recording" is not fixed, so
the folio wraps up all the archives and the configuration of the "recording"
tool ... this is sufficient to allow the "replay" operation to be performed
after the archives have been created.

> *     The archive management tool, pmlogextract and indirectly,
> pmlogger_daily and pmlogger_merge, provide the ability to extract data from
> multiple archives and combine that data into a single archive volume.

pmlogreduce is another tool in the same vein, but this one does statistical
reduction, while pmlogextract slices and dices in none, one or two of the
temporal and metric dimensions.

> *     Otherwise, PCP client tools are currently restricted to extracting metrics
> from a single archive volume via PM_CONTEXT_ARCHIVE (the -a option). A single
> archive volume and an option time window is specified, which is applied against
> that single archive volume.

Some tools (pmie and pmdumptext, for example) can handle multiple archives
from different hosts, but restrict themselves to at most one archive per
host.
 
> What we would like have is for PCP client tools to have the ability to easily extract
> metrics from multiple archive volumes. ...

I think the goal is to have the tools easily able to process multiple
archives from the _same_ host.  Not only must the archives be from the same
host, but they need to represent _disjoint_ time intervals (no overlapping
time between the archives) so they can be temporally sorted to provide a
single time series.

> ... Ultimately, we would also like tail-like
> following of an active archive volume with seamless transition from archived data
> to live data.

While this is highly desirable, I suspect it will be a disjoint piece of
work from processing a set of temporally ordered and existing archives.

> Here are a few ideas for realizing these goals:
> 
> Client/tool interface:
> Currently only a single archive volume may be specified by its base name (via
> PM_CONTEXT_ARCHIVE or -a). We could allow the specification of multiple
> archive specs, each of which could be:
> 
> 
> *     an archive volume file base name -- same as now
> *     the name of a directory containing a collection of PCP archive volumes
> *     wildcarded names which resolve to a list of the above items
> 
> 
> For example,
> 
>    pminfo -a 20140930.0 -a 201408*.* -a /some/path/archives -a /another/path/archive*
> 
> PM_CONTEXT_ARCHIVE could be extended to support more than one archive
> volume.

As others have pointed out ... each -a gets mapped to a context, so we need
some sort of syntax that can name more than one archive in a single command
line argument to be used with -a ... so this leads to the following options:
- dirname
- glob-like, probably not just * but the whole shooting match of ?, [...]
and {...,...}
- a list, e.g. -a 20141001,20140930

Now this could be handed off to pmNewContext, and the client could use a
single PMAPI context as a handle to access this _set_ of archives
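
To make the three forms concrete, here is a rough sketch of how a single -a
argument might be expanded into archive base names.  It is in Python rather
than C purely for brevity, and expand_archive_spec is a made-up name, not
anything in libpcp.  It assumes each archive can be recognised by its
<base>.meta file, and it leans on Python's glob module, which handles *, ?
and [...] but not {...,...} (brace patterns would also clash with splitting
on commas, so that case needs more thought).

```python
import glob
import os


def expand_archive_spec(spec):
    """Expand one -a argument into a sorted list of archive base names.

    Hypothetical helper for illustration only; not part of libpcp.
    Assumption: an archive is identified by the presence of <base>.meta.
    """
    names = set()
    for part in spec.split(","):              # list form: a,b,c
        part = part.strip()
        if not part:
            continue
        if os.path.isdir(part):               # dirname form
            pattern = os.path.join(part, "*.meta")
        else:                                 # plain base name or glob form
            pattern = part + ".meta"
        for meta in glob.glob(pattern):
            names.add(meta[:-len(".meta")])   # strip the suffix to get the base
    return sorted(names)
```

The result is sorted by name only; ordering the set in time would have to be
done separately from the archive labels.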

For this to work, we need some restrictions on the set of archives that can
be combined in this way:
- all for the same host
- non-overlapping time windows

If these are not satisfied, pmNewContext needs to return a (new) error code.
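
The two restrictions could be checked along the following lines.  Again this
is only a sketch in Python with invented names (ArchiveLabel,
ArchiveSetError, order_archive_set); the real check would live in libpcp
and use the archive label records.  I have assumed that one archive ending
at exactly the instant the next one starts does not count as an overlap.

```python
from dataclasses import dataclass


@dataclass
class ArchiveLabel:
    """Hypothetical summary of one archive's label."""
    name: str
    hostname: str
    start: float   # time of first record, seconds since the epoch
    end: float     # time of last record, seconds since the epoch


class ArchiveSetError(Exception):
    """Stand-in for the (new) error code pmNewContext would return."""


def order_archive_set(labels):
    """Return the archives sorted by start time, or raise if the set is invalid."""
    if not labels:
        raise ArchiveSetError("empty archive set")
    hosts = {a.hostname for a in labels}
    if len(hosts) > 1:
        raise ArchiveSetError("archives from different hosts: "
                              + ", ".join(sorted(hosts)))
    ordered = sorted(labels, key=lambda a: a.start)
    # After sorting by start time, any overlap shows up between neighbours.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start < prev.end:
            raise ArchiveSetError(f"{prev.name} and {cur.name} overlap in time")
    return ordered
```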

Then we need to consider the metadata:
- timezone could change - this will require some further investigation
before a cunning plan can be proposed
- PMNS - merge 'em all while there are no conflicts ... in the case of a
conflict (different names map to the same PMID or the same name is assigned
more than one PMID) we probably need dynamic remapping ("first one found
wins" is probably the right strategy)
- metric descriptors - if these change it gets very messy, although this is
rare in practice
- instance domains - should be close to OK, as these are already expected to
vary over time ... it would be bad if the semantics of the instance domain
members changed between archives, but this is more of a PMDA botch issue
than a problem for libpcp to solve
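
For the PMNS item, "first one found wins" might look something like the
sketch below.  This is illustrative Python, not libpcp code: merge_pmns is a
made-up name, each archive's PMNS is modelled as a plain dict from metric
name to a PMID string, and the conflicts are simply collected so that a
later remapping step (not shown) could deal with them.

```python
def merge_pmns(namespaces):
    """Merge per-archive name->PMID maps, keeping the first binding seen.

    namespaces: list of dicts, in archive order (hypothetical representation).
    Returns (merged name->PMID map, list of (archive index, name, pmid) conflicts).
    """
    name_to_pmid = {}
    pmid_to_name = {}
    conflicts = []
    for index, pmns in enumerate(namespaces):
        for name, pmid in pmns.items():
            if name in name_to_pmid:
                if name_to_pmid[name] != pmid:
                    # same name assigned more than one PMID
                    conflicts.append((index, name, pmid))
                continue
            if pmid in pmid_to_name:
                # different names map to the same PMID
                conflicts.append((index, name, pmid))
                continue
            name_to_pmid[name] = pmid
            pmid_to_name[pmid] = name
    return name_to_pmid, conflicts
```

An empty conflict list means the namespaces merged cleanly; anything else is
what the dynamic remapping would have to handle.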

One simple solution that might be acceptable for 95% of the cases would be
to declare all of the metadata differences (except instance domains) to be
unsupported, so pmNewContext would fail.  The user's option for resolving
this is to use pmlogrewrite to amend one or more of the archives and remove
the differences.  I think this is definitely an OK plan.

> Extracting multi-volume data:
> For PCP tools, one very simple idea for extracting data from multiple existing
> archive volumes would be to use pmlogextract(1)  ...

Probably does not move the game along enough to warrant consideration, and
involves a bucket load of I/O ... we can do better than this I believe.

> Streaming live data:
> ...

I suggest deferring this for another discussion ... it really is an
independent piece of work that could go before or after or in parallel with
the archive "set" changes.

Thanks for making the running on this one, Dave.

