Re: [pcp] Multi-Archive Contexts: Scaling and Consistency

To: Dave Brolley <brolley@xxxxxxxxxx>
Subject: Re: [pcp] Multi-Archive Contexts: Scaling and Consistency
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Mon, 16 Nov 2015 18:35:59 -0500 (EST)
Cc: PCP Mailing List <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <564A3307.4010200@xxxxxxxxxx>
References: <564258F5.20309@xxxxxxxxxx> <1476262521.9272541.1447218203575.JavaMail.zimbra@xxxxxxxxxx> <564A3307.4010200@xxxxxxxxxx>
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Thread-index: 2FNgGjZFQ3PLgkXsOkmd/nIHMjm1gA==
Thread-topic: Multi-Archive Contexts: Scaling and Consistency

Hi Dave,

----- Original Message -----
> My feeling on this is that it will not be difficult to delay reading all of
> the .meta data until necessary and that I see no need to force this upon
> tools which do not need to do so, even if they turn out to be in the
> minority. I will make sure to isolate the code so that this decision can be
> easily reversed.

Sounds good.

> But are disappearing archives expected to be handled within an open libpcp
> context? Even the current single-archive contexts only keep one volume open

In the case of a directory (like /var/log/pcp/pmlogger/<host>) I think we
should expect "good" behaviour for archives disappearing.  In the case of
disappearing off the tail of the time window (the most important case, I
suspect) this should just equate to PM_ERR_EOL instead of exposing ENOENT.
And yep I agree - in an initial implementation, that'll be enough - I'd
not worry at first about archives disappearing elsewhere in the timeline.
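
A minimal sketch of that error mapping, in Python rather than libpcp's C so
it stays self-contained. The function name, its arguments and the numeric
value given to PM_ERR_EOL are all assumptions made for illustration (the real
constant lives in pmapi.h). Which end of the timeline "the tail" means is not
pinned down above, so the sketch treats both ends alike.

```python
import errno

# Placeholder value only; the real PM_ERR_EOL is defined in pmapi.h.
PM_ERR_EOL = -1000


def open_error_for(index, archive_count, os_errno):
    """Map a failed archive open to the error the caller would see.

    index         -- position of the archive in the time-ordered list
    archive_count -- number of archives known to the context
    os_errno      -- errno reported by the failed open
    """
    vanished = os_errno == errno.ENOENT
    at_edge = index in (0, archive_count - 1)
    if vanished and at_edge:
        # An archive at an end of the time window has gone away:
        # report end-of-log rather than exposing ENOENT.
        return PM_ERR_EOL
    # Initial implementation: every other failure is passed through.
    return -os_errno
```

With three archives, a vanished first or last archive yields PM_ERR_EOL,
while a vanished middle archive still surfaces as -ENOENT.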

> at a time. If we were to return to a previously used volume which has
> disappeared, for any reason, including log rotation, while the context is
> open, an error will occur when we attempt to re-open the now-nonexistent
> volume. I don't think that a set of archives within a multi-archive context
> can or should be treated any differently.

The directory-of-archives situation is slightly different though (hmm, this
is an interesting case indeed, where we might want the behaviour for a comma
separated archive list (IOW, multiple, explicitly-named archives) to differ
to that of a directory).

> [...] I am leaning toward having libpcp automatically handle new archives
> which appear at the end of the overall timeline.

I agree, that will be an excellent starting point.  Do think about using data
structures that'll make it easy to generalise further though, down the track.

> Other than for the case you give above of two actively-being-written archives
> within the same directory, the difficulty comes not in inserting new
> archives into their proper place in the overall timeline but with the
> difficulty of new metrics being introduced in the middle of an area of the
> timeline which has already been traversed.

Keep in mind the tools that will care - most tools will just run, process the
data that is available then-and-there and exit - none of those tools care at all
about the appear/disappear cases.

The ones that do are long-running - pmchart, pmwebd - I suspect these will both
self-correct, even in the situation where data appears "in the middle" of a time
sequence (pmchart plots "no data", and then later will find data to plot if it
moves back over that time sequence).  So, this may not be as difficult as it
first appears.

> The best example I can come up with would be having archives A and C initially
> and having traversed from archive A into archive C while scaling the results
> of some counter. The final sample from A and the initial sample from C will
> be interpolated to produce a rate that is constant for queries within the
> gap between A and C. Archive B now appears in that gap and the same timespan
> is traversed again. This time the detailed rates are given for the timespan
> of B. Is this a problem? Maybe, maybe not.

Pretty sure we cannot attempt continuation across archives like this FWIW - an
implicit mark record exists at the start/end of any archive, right?  That will
need to be maintained in the new world order - definitely no interpolation of
counters across archives (see how pmlogextract handles this, inserting a mark
record explicitly at the old boundary).
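
To illustrate the effect of a mark record at each archive boundary, here is
a rough Python sketch. It is not libpcp or pmlogextract code: the MARK
sentinel, the (timestamp, value) sample tuples and the counter_rates name
are all hypothetical. The point is only that rate conversion never pairs the
last sample of one archive with the first sample of the next.

```python
MARK = None  # stand-in for a mark record in the merged sample stream


def counter_rates(archives):
    """Return (t0, t1, rate) for consecutive samples within one archive.

    archives -- list of archives, each a time-ordered list of
                (timestamp, counter_value) tuples
    """
    # Merge the archives, inserting a mark at every boundary.
    stream = []
    for samples in archives:
        if stream:
            stream.append(MARK)
        stream.extend(samples)

    rates = []
    prev = None
    for record in stream:
        if record is MARK:
            prev = None  # discontinuity: forget the previous sample
            continue
        if prev is not None:
            (t0, v0), (t1, v1) = prev, record
            rates.append((t0, t1, (v1 - v0) / (t1 - t0)))
        prev = record
    return rates
```

For archives A = [(0, 0), (10, 100)] and C = [(100, 5000), (110, 5200)] this
gives one rate inside A and one inside C, and nothing for the gap between
them.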

But, I agree - don't worry too much about this case yet, it's not needed for an
initial implementation, and I think the way you've described tackling it will
work out well.  Using a tree over an array is mainly intended to help with your
scaling concerns, and just happens to have nice insertion properties if/when we
come back to a more dynamic model of archives appearing/disappearing within the
time series.
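
As a rough illustration of the insertion property, a Python sketch of a
binary search tree keyed on archive start time follows. The Node class, the
insert and in_order helpers and the use of a plain number as the start time
are all assumptions; the tree is deliberately unbalanced to keep it short,
whereas anything aimed at the scaling concern would presumably want a
balanced variant.

```python
class Node:
    """One archive in the timeline, keyed by its start time."""

    def __init__(self, start, name):
        self.start = start
        self.name = name
        self.left = None
        self.right = None


def insert(root, start, name):
    """Insert an archive and return the (possibly new) root."""
    if root is None:
        return Node(start, name)
    if start < root.start:
        root.left = insert(root.left, start, name)
    else:
        root.right = insert(root.right, start, name)
    return root


def in_order(root):
    """Yield archive names in timeline order."""
    if root is not None:
        yield from in_order(root.left)
        yield root.name
        yield from in_order(root.right)
```

Inserting A (start 0) and C (start 200), then later B (start 100), yields
A, B, C from in_order without shuffling any existing entries, which is the
case of an archive appearing in the middle of the timeline.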

cheers.

--
Nathan
