[pcp] pmlogreduce - use by date has expired
kenj at internode.on.net
Sun Jan 10 14:12:04 CST 2010
After the unplanned intervention of dynamic metrics in the pmns and
derived metrics I'm thinking about getting back to this long-standing
I'll respond to the first round of feedback in the hope that this will
trigger some more suggestions and comments.
On Mon, 2008-09-15 at 15:24 +1000, Nathan Scott wrote:
> Hi Ken,
> On Thu, 2008-09-11 at 15:36 +1000, Ken McDonell wrote:
> > ...
> > Some Things that WILL be Supported
> > The existing pmlogreduce attempts some of the list below, but most of
> > these features are either not implemented, or implemented incorrectly
> > in the current code.
> > * The temporal reduction is achieved by the -t delta command
> > line option. The output archive will contain observations at
> > most once per delta for each metric-instance pair in the input
> > archive.
> Have been wondering to myself whether the ability to have
> sets of values recorded at different frequencies in the new
> log would be useful (iow, different -t for different sets
> of metrics) ... like pmlogger allows. I'm undecided, but
> have you given that option thought? Complicates things a
> fair bit, I guess, I'm leaning toward "probably not worth
> it" but just thought I'd mention it.
I've added this to a new section "Some Things that COULD be Supported"
to collect worthy ideas that are outside the scope of my first rewrite
> > * The size of the output archive may be limited with the -s
> > command line option.
> How does that combine with -t? (when the size limit is hit,
> it just ends the archive & warns user?)
The -s would be a sample limit, so stop after N results in the output
archive. So the duration of the output archive would be N times the
delta from the -t option. I don't think there is an issue of confusion
here, and this simply maintains the existing behaviour.
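As a rough illustration of how -s and -t would combine, here is a minimal Python sketch (not pmlogreduce code). The function name, the argument names and the choice to place the first output sample one delta after the start are assumptions made for the example only.

```python
def output_timestamps(start, end, delta, max_samples=None):
    """Return one output timestamp per delta, stopping at the sample limit.

    start, end and delta are in seconds; max_samples plays the role of -s.
    Assumption: the first output sample lands at start + delta.
    """
    stamps = []
    t = start + delta
    while t <= end:
        if max_samples is not None and len(stamps) >= max_samples:
            break  # sample limit reached, the output archive ends here
        stamps.append(t)
        t += delta
    return stamps


# One hour of input, -t 600 and -s 4 (all values hypothetical).
delta = 600
stamps = output_timestamps(start=0, end=3600, delta=delta, max_samples=4)
duration = len(stamps) * delta  # N times delta
```

With these inputs the output stops after four samples, so the output archive covers 4 * 600 seconds even though the input runs for an hour.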
> > * Multi-volume output archives will be supported through the -v
> > command line option and internal volume switching logic to
> > ensure the 32-bit offset limit of the temporal index is not
> > exceeded.
> Should that be automatic and only if needed? (no -v)
OK, I agree with the automatic volume switching (although the testing
regime for archives of 2^32 bytes is a little scary). It turns out that
-v is currently parsed from the command line and then ignored ... I'd
like to fix this. The argument for size-limited archives is weak (has
to do with file copying logistics) but is carried through from
pmlogger ... at some point it may make sense in a major release to
simply retire the -v option for all tools.
> > * Counters will be rate converted (so mapped to INSTANTANEOUS
> > metrics, have their semantics changed when the TIME DIMENSION
> > is reduced by one, e.g. MBYTE -> MBYTE / SEC, and their TYPE
> > will be converted to DOUBLE).
> This could potentially make larger output files than input files.
> Would an option for FLOAT instead of DOUBLE be useful to prevent
> that phenomenon?
The "larger" scenario could only really happen if the -t for pmlogreduce
was about twice the -t value used for pmlogger, which seems unlikely in
most real uses of pmlogreduce. The difference in size per instance
value between the insitu types, float and double is as follows:
    insitu    8 bytes
    float    16 bytes
    double   20 bytes
So the float vs double saving is only about 25%. But more relevant is
being realistic about precision ... a float is more than enough to
represent the real precision of the numbers we're dealing with,
especially after interpolation and reduction. I've changed this to
FLOAT as the output format.
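The size arithmetic can be checked with a small Python sketch. It uses only the per-value sizes quoted above and assumes the input values are all insitu, ignoring per-record overhead and any suppression of repeated values; the function and variable names are invented for the illustration.

```python
# Bytes per instance value, taken from the figures quoted above.
BYTES_PER_VALUE = {"insitu": 8, "float": 16, "double": 20}


def size_ratio(out_type, reduction_factor, in_type="insitu"):
    """Output size relative to input size for one metric-instance pair.

    reduction_factor is (pmlogreduce -t) / (pmlogger -t), i.e. how many
    input values are folded into one output value.
    """
    return BYTES_PER_VALUE[out_type] / (BYTES_PER_VALUE[in_type] * reduction_factor)


# Reduction factor at which the output is exactly as large as the input.
break_even_double = BYTES_PER_VALUE["double"] / BYTES_PER_VALUE["insitu"]
break_even_float = BYTES_PER_VALUE["float"] / BYTES_PER_VALUE["insitu"]

# Float versus double: 4 bytes saved per value.
saving_vs_double = 4 / BYTES_PER_VALUE["double"]  # fraction of the double size
extra_vs_float = 4 / BYTES_PER_VALUE["float"]     # fraction of the float size
```

Under these assumptions the output only grows when the reduction factor is below 2.5 for double or below 2 for float, and the 4 byte difference is 20% of the double size, or 25% of the float size.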
> > Some Open Questions
> > The following issues warrant some discussion before I make unilateral
> > decisions.
> > 1. Output Window Clipping. In several useful deployments of
> > pmlogreduce one may wish to further restrict the temporal
> > domain by selecting some re-occurring periods to be included,
> > and some to be excluded. Examples might be between the hours
> > 08:00 and 20:00 each day, and/or each day excluding Saturday
> > and Sunday. There are several problems here:
> > 1. suitable command line syntax to specify this sort of
> > clipping
> > 2. what would the output archive contain - no pmResult,
> > or pmResult and no metrics (which is formally a MARK
> > record) for each delta in the "clipped" region
> I'd go for the former just to save space, in absence of a
> compelling reason either way.
> In my local use-case-scenario, I'd imagine we'd be doing
> this clipping via logextract in the first level of daily
> archive to some other interval (weekly/monthly/...) log
> munging (which will also reduce the set of metrics stored
> longer term, etc), and then running logreduce on that -
> so we'd have no reason to need this AFAICS. But perhaps
> other use-cases would call for it.
I'm going to move this into the "COULD" section. As you point out there
are other ways of achieving the same end result.
> > 1. Should DISCRETE metrics appear in the output only if there is
> > a value observed in the corresponding interval in the input
> > archive? The alternative is to have all metrics repeated in
> > every pmResult in the output archive.
> That doesn't seem a good alternative - I'd go with the first
> option, or use the last previous value seen (may be outside
> the window) for discrete metrics.
Agreed, and now in the WILL be Supported section.
> > 1. For DISCRETE metrics, and all but the last value before a MARK
> > record or the end of the input archive for INSTANTANEOUS
> > metrics, consecutive identical values can be omitted without
> > changing the data semantics - is this worth it?
> I think so. If these are string valued (like topology metrics,
> or some such thing) these could waste plenty of space.
Also added to the WILL be Supported section under the general category
of suppression of repeated values.
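A minimal Python sketch of the repeated-value suppression rule, under stated assumptions: samples for one metric-instance pair arrive as (time, value) tuples, a MARK record is represented by a sentinel object, and the first value after a MARK is always kept. None of these names or representations come from the pmlogreduce source.

```python
MARK = object()  # hypothetical stand-in for a MARK record


def suppress_repeats(samples, discrete):
    """Drop consecutive identical values for one metric-instance pair.

    For DISCRETE metrics every repeat is dropped. For INSTANTANEOUS
    metrics the last value before a MARK record or the end of the
    archive is kept even if it repeats the previous value.
    """
    kept = []
    have_prev = False
    prev_value = None
    for i, sample in enumerate(samples):
        if sample is MARK:
            kept.append(sample)
            have_prev = False  # assumption: start afresh after a MARK
            continue
        _, value = sample
        last_in_segment = i + 1 == len(samples) or samples[i + 1] is MARK
        changed = not have_prev or value != prev_value
        if changed or (not discrete and last_in_segment):
            kept.append(sample)
        prev_value, have_prev = value, True
    return kept


samples = [(0, "a"), (10, "a"), (20, "a"), MARK, (30, "a"), (40, "b"), (50, "b")]
discrete_out = suppress_repeats(samples, discrete=True)
instant_out = suppress_repeats(samples, discrete=False)
```

For the made-up input above, the discrete case keeps only the values at times 0, 30 and 40 (plus the MARK), while the instantaneous case also keeps the values at times 20 and 50 because they close a segment.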
> > 1. What to do with COUNTER metrics that have a TIME dimension
> > other than 0 or 1? I don't know that we have any such
> > metrics, and I'm not sure what the real semantics of data like
> > this might be, but it seems pretty obvious that "rate
> > conversion" is not going to make the semantics any more
> > obvious!
> Yeah, just leave as-is I guess.
Yes, I agree.
> > 2. For INSTANTANEOUS and DISCRETE metrics with non-numeric
> > values, we have to decide what to do if multiple observations
> > appear in the input archive within a single output archive
> > time interval. Taking the last observed value seems to be the
> > least worst thing to do.
> Yep, agreed.
On Tue, 2008-09-16 at 15:14 +1000, Max Matveev wrote:
> On Thu, 11 Sep 2008 15:36:08 +1000, Ken McDonell wrote:
> kenj> Counters will be rate converted (so mapped to INSTANTANEOUS
> kenj> metrics, have their semantics changed when the TIME DIMENSION
> kenj> is reduced by one, e.g. MBYTE -> MBYTE / SEC, and their TYPE
> kenj> will be converted to DOUBLE).
> What if instead of converting counters to instantaneous metrics you
> simply accumulate them over the new interval and leave them as just
> counters with optional conversion to double if you're concerned about
> wrapping. That should solve your problem with non-obvious temporal
The problem here is MARK records and pmcd restarts ... doing the
piece-wise integration of the counter values over the time intervals
where we do have data and then using the time average over the output
interval is, I think, the best way to aggregate the available
information. Consider this example
time ctr value
Now if the output time interval was 40, then using counter semantics I
don't think I can compute the value at time 40. But using instantaneous
averaging I can compute the rate to be 12 for 39 of the 40 seconds in
the output interval, so a value of 12 would be used in the output
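The piece-wise integration and time averaging described here can be sketched in Python as follows. The sample data is made up for illustration (it is not the data from the example in this message), the MARK sentinel and function name are hypothetical, and counter wrap is ignored.

```python
MARK = object()  # hypothetical stand-in for a MARK record (e.g. pmcd restart)


def average_rate(samples):
    """Time-averaged rate of a counter over the stretches where data exists.

    samples is a time-ordered list of (time, counter_value) tuples and MARK
    sentinels for one output interval. Counter deltas are summed only between
    consecutive samples not separated by a MARK, then divided by the time
    actually covered. Returns (rate, covered_seconds); rate is None if no
    pair of samples is usable.
    """
    total_delta = 0
    covered = 0
    prev = None
    for sample in samples:
        if sample is MARK:
            prev = None  # cannot integrate across a MARK record
            continue
        if prev is not None:
            total_delta += sample[1] - prev[1]
            covered += sample[0] - prev[0]
        prev = sample
    if covered == 0:
        return None, 0
    return total_delta / covered, covered


# Output interval of 40 seconds with a restart between t=20 and t=21,
# after which the counter starts again from a small value.
samples = [(0, 100), (10, 220), (20, 340), MARK, (21, 5), (30, 113), (40, 233)]
rate, covered = average_rate(samples)
```

With this data the counter value at time 40 says nothing about the whole interval, but the rate is well defined for the 39 seconds on either side of the MARK, and that average is what would go into the output.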
On Fri, 2008-09-12 at 14:07 +1000, Mark Goodwin wrote:
> I guess this might enable another holy grail: derived metrics across more
> than just the temporal domain. And even more strangely, archives containing
> data from more than one host.
It is going to be really hard to incorporate derived metrics into
pmlogreduce because it uses pmFetchArchive for the heavy lifting, and
without a list of target pmids the derived metrics stuff has no chance
to do its thing. Of course, derived metrics could always be used with
the output archive from pmlogreduce.
What did you have in mind for "more than just the temporal domain"?
Archives spanning multiple hosts is not going to happen in my lifetime,
or at least won't be done by me. This is orthogonal to the whole design
centre for archives which aimed to make an archive semantically as close
to a real-time source of metrics as we could manage. And I just don't
see the compelling need for data from more than one host in a single
archive, as opposed to tools processing one archive per host, as is
currently done by pmie, pmchart, pmdumptext, ...
> Also, pmid remapping or aliasing would be a good feature to have, but
> maybe that's a job for a different tool.