
Re: pmlogreduce - use by date has expired

To: kenj@xxxxxxxxxxxxxxxx
Subject: Re: pmlogreduce - use by date has expired
From: Mark Goodwin <markgw@xxxxxxx>
Date: Fri, 12 Sep 2008 14:07:10 +1000
Cc: pcp@xxxxxxxxxxx
In-reply-to: <1221111368.25428.11.camel@bozo>
Organization: SGI Engineering
References: <1221111368.25428.11.camel@bozo>
Reply-to: markgw@xxxxxxx
Sender: pcp-bounce@xxxxxxxxxxx
User-agent: Thunderbird 2.0.0.16 (Windows/20080708)


Ken McDonell wrote:
Doing temporal data reduction correctly for PCP archives has been an
itch I've had for about 12 years ... yep the itch predates pmlogmerge,
which predates pmlogextract, which predates pmlogreduce.  They all
managed to not solve the problem in assorted creative ways.

So, attached are my initial thoughts for a new pmlogreduce.

Comments most welcome before I start hacking too seriously.


------------------------------------------------------------------------


Proposal for a replacement pmlogreduce

Ken McDonell
kenj@xxxxxxxxxxx

In the open source PCP distribution, the existing pmlogreduce tool is a quick hack in response to:

   1. failure of both pmlogmerge and pmlogextract to meet their original
      slice-n-dice specifications,
   2. expediency for the SGI NASmanager product to be able to support
      PCP archives spanning days, weeks and months,

We effectively solved this by changing the nasavg PMDA to only use archives to prime the graph history (and limiting the duration), but then switch to live mode - i.e. the PMDA is also a live PCP client, running in what might be called head-up-your-own-ass-mode :)

I guess this might enable another holy grail: derived metrics across more
than just the temporal domain. And even more strangely, archives containing
data from more than one host.

Also, pmid remapping or aliasing would be a good feature to have, but
maybe that's a job for a different tool.

More comments when I have more time ...

Cheers
-- Mark

   3. getting the data semantics correct is at best hard, and in some
      cases impossible when the temporal domain is compressed.

This document outlines a plan to rewrite pmlogreduce to address the deficiencies of the current implementation.


Basics

    * One input archive - from either pmlogger or pmlogextract.
      Specifically, if you want to combine multiple archives and do data
      reduction, you'll need to:
         1. keep all the original archives
         2. concatenate them (and possibly filter them, see below)
            with pmlogextract
         3. then use pmlogreduce to apply the temporal reduction
    * One output archive.
    * Focus on semantically correct data reduction in the temporal domain.
    * We intend to preserve the semantics of pmlogger's output as much
      as possible.  In particular this means when the archive is
      processed with any of the standard tools, the value reported at
      time t is representative of the value that would have been
      observed over the interval up to time t.
    * The acid test of correctness should be that a reporting tool, e.g.
      kmchart, should produce the same results with either the input
      archive or the output archive when the reporting interval is set
      to the same delta as was used to create the output archive from
      pmlogreduce.


Some Things NOT Supported

    * Filtering of instances or metrics - pmlogextract does a fine job
      of this, and we're not going to make pmlogreduce even more
      complicated to support this functionality.
    * PMID re-mapping - if the PMID of a metric has the misfortune to
      change over its life, pmlogextract will choke and we never get to
      pmlogreduce.  The right way to address this would be an extension
      to pmlogextract or the binary PCP archive editor that has been
      part written and part threatened (pmlogneurosurgeon?).
    * Instance domain re-mapping - it seems the only sane assumption is
      that the internal instance identifiers maintain constant semantics
      for each instance domain over the duration processed by pmlogreduce.
    * Changes in metric semantics.  Many of these are impossible to
      support, and the few that make sense require pmResult rewriting
      and should probably be done in a steroid-enhanced version of
      pmlogextract.

Since the variations that involve changes to metric semantics or metric metadata would have to make it through pmlogextract, the problem really belongs there, and pmlogreduce is effectively insulated from these ugly issues by the "I only accept one input archive" assertion.


Some Things that WILL be Supported

The existing pmlogreduce attempts some of the items in the list below, but most of these features are either not implemented, or implemented incorrectly in the current code.

    * The temporal reduction is achieved by the -t delta command line
      option.  The output archive will contain observations at most once
      per delta for each metric-instance pair in the input archive.
    * The -A align command line option may be used to align the
      observations in the output archive to natural time boundaries.
    * The -S and -T command line options may be used to specify a
      starting and/or ending time window on the input archive (and hence
      the output archive).
    * The -Z and -z command line options are supported to vary the
      timezone interpretation of the -S and -T options.
    * The size of the output archive may be limited with the -s command
      line option.
    * Multi-volume output archives will be supported through the -v
      command line option and internal volume switching logic to ensure
      the 32-bit offset limit of the temporal index is not exceeded.
    * Counters will be rate converted: they are mapped to INSTANTANEOUS
      metrics, their units change because the TIME DIMENSION is reduced
      by one (e.g. MBYTE -> MBYTE / SEC), and their TYPE is converted
      to DOUBLE.
    * Counters that wrap between consecutive observations in the input
      archive will be treated as a single counter wrap and converted
      accordingly.  Note that if one or more MARK records separates the
      consecutive observations, the wrap conversion will not be done.
    * INSTANTANEOUS metrics with numeric value will be converted to a
      time-average.  For example, consider the input archive data below:

Time    Value
60      25
120     100
180     80
240     20


Then for the interval 100-200, the output value computed by pmlogreduce would be:
(25*(120-100)+100*(180-120)+80*(200-180))/100 = 81. Alternatively consider this to be the integral under the curve of the value over a time interval, divided by the length of the time interval.
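
The arithmetic above can be checked with a small Python sketch. This is only an illustration, not pmlogreduce code: the helper name time_average is hypothetical, and it assumes (as the worked example does) that each observed value is held until the next observation and that the interval is fully covered by observations.

```python
def time_average(samples, start, end):
    """Time-weighted average of a metric over the interval [start, end].

    samples is a list of (time, value) pairs sorted by time.
    Hypothetical helper for illustration only, not part of PCP.
    Assumes each value is held until the next observation and that
    the first observation is at or before start.
    """
    total = 0.0
    for i, (t, v) in enumerate(samples):
        # The value v applies from its own timestamp until the next
        # observation (or until the end of the interval for the last one).
        t_next = samples[i + 1][0] if i + 1 < len(samples) else end
        lo = max(t, start)
        hi = min(t_next, end)
        if hi > lo:
            total += v * (hi - lo)
    # Integral under the curve divided by the length of the interval.
    return total / (end - start)


samples = [(60, 25), (120, 100), (180, 80), (240, 20)]
print(time_average(samples, 100, 200))  # prints 81.0
```
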


    * Support for MARK records and missing data (at interval
      boundaries).  The notion of a confidence level will be introduced,
      with a -k percent command line option.  If the value for a
      metric-instance is defined over at least percent of the interval,
      then the corresponding value will be used as representative of the
      value over the whole interval - which is like saying the missing
      value was at the observed value for the remainder of the interval.
      A likely default percent is 85.
    * In the region of MARK records, the value will correctly be
      interpreted as unknown between the last observation and the MARK,
      and between the MARK and the first observation.  The one exception
      is DISCRETE metrics where a prior value is defined right up to the
      MARK record.
    * Dynamic instance domains will be supported.
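
As a rough illustration of the counter handling listed above (rate conversion, the single-wrap rule and the MARK exception), here is a minimal Python sketch. Nothing in it is PCP API: the function name counter_rate, the bits parameter and the mark_between flag are all hypothetical, and a 32-bit counter width is assumed by default.

```python
def counter_rate(prev, curr, bits=32, mark_between=False):
    """Convert two consecutive counter observations into a rate per second.

    prev and curr are (time_in_seconds, raw_counter_value) pairs.
    bits is the assumed width of the counter (an assumption, 32 by default).
    mark_between says whether a MARK record separates the two observations.
    Returns a float (the DOUBLE rate), or None when no rate can be derived.
    Hypothetical helper for illustration only, not part of PCP.
    """
    (t0, v0), (t1, v1) = prev, curr
    if mark_between or t1 <= t0:
        # Across a MARK the value is unknown, so neither rate conversion
        # nor wrap correction is attempted.
        return None
    delta = v1 - v0
    if delta < 0:
        # The counter went backwards: treat it as exactly one wrap.
        delta += 1 << bits
    # E.g. MBYTE becomes MBYTE / SEC: the time dimension drops by one.
    return delta / (t1 - t0)


print(counter_rate((0, 100), (10, 600)))        # prints 50.0
print(counter_rate((0, 2**32 - 100), (10, 400)))  # prints 50.0 (one wrap)
```

If the counter actually wrapped more than once between the two observations, this sketch would under-report the rate; that is the price of the "single counter wrap" assumption.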


Some Open Questions

The following issues warrant some discussion before I make unilateral decisions.

   1. Output Window Clipping.  In several useful deployments of
      pmlogreduce one may wish to further restrict the temporal domain
      by selecting some re-occurring periods to be included, and some to
      be excluded.  Examples might be between the hours 08:00 and 20:00
      each day, and/or each day excluding Saturday and Sunday.  There
      are several problems here:
         1. suitable command line syntax to specify this sort of clipping
         2. what would the output archive contain - no pmResult, or
            pmResult and no metrics (which is formally a MARK record)
            for each delta in the "clipped" region
         3. there is no real tool support to replay and/or report on an
            archive of this style

   2. Should DISCRETE metrics appear in the output only if there is a
      value observed in the corresponding interval in the input archive?
      The alternative is to have all metrics repeated in every pmResult
      in the output archive.
   3. For DISCRETE metrics, and all but the last value before a MARK
      record or the end of the input archive for INSTANTANEOUS metrics,
      consecutive identical values can be omitted without changing the
      data semantics - is this worth it?
   4. What to do with COUNTER metrics that have a TIME dimension other
      than 0 or 1?  I don't know that we have any such metrics, and I'm
      not sure what the real semantics of data like this might be, but
      it seems pretty obvious that "rate conversion" is not going to
      make the semantics any more obvious!
   5. For INSTANTANEOUS and DISCRETE metrics with non-numeric values, we
      have to decide what to do if multiple observations appear in the
      input archive within a single output archive time interval.
      Taking the last observed value seems to be the least worst thing
      to do.
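
To make the window clipping question (item 1) a little more concrete, the selection test itself is simple; the open issues are the command line syntax and the archive format. A minimal Python sketch of the test, using the examples from item 1 as defaults, might look like this. The function name in_window and its parameters are hypothetical, and timestamps are assumed to be already converted to the reporting timezone.

```python
from datetime import datetime, time


def in_window(ts, start=time(8, 0), end=time(20, 0), skip_weekdays=(5, 6)):
    """Return True if timestamp ts falls inside the recurring output window.

    Defaults follow the examples in the proposal: 08:00 to 20:00 each day,
    excluding Saturday and Sunday.  Hypothetical helper for illustration
    only; ts is a naive datetime in the reporting timezone.
    """
    # datetime.weekday() numbers Monday as 0, so 5 and 6 are the weekend.
    if ts.weekday() in skip_weekdays:
        return False
    return start <= ts.time() < end


print(in_window(datetime(2008, 9, 12, 14, 7)))  # Friday afternoon: True
print(in_window(datetime(2008, 9, 13, 14, 7)))  # Saturday: False
```
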

--

 Mark Goodwin                                  markgw@xxxxxxx
 Engineering Manager for XFS and PCP    Phone: +61-3-99631937
 SGI Australian Software Group           Cell: +61-4-18969583
-------------------------------------------------------------

