Derived Metrics - RFC

To: pcp@xxxxxxxxxxx
Subject: Derived Metrics - RFC
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Wed, 04 Nov 2009 18:22:23 +1100
Reply-to: kenj@xxxxxxxxxxxxxxxx
I'm returning to one of the PCP "holy grails" ... derived metrics.

For those who've not been down this tortured path before, the idea is
that one could define one or more derived metrics in terms of an
arithmetic expression over existing metrics, and values for the derived
metrics would be available just like regular PCP metrics.

The most common examples are ...
      * the requirement to compute something that is "delta(value) /
        delta(other_value)", e.g. average time per message where both
        "time" and "number of messages" are counters
      * aggregation of existing metrics, e.g. "messages" being the sum
        of synchronous and asynchronous messages sent and received, or
        total packet rate across all gigE interfaces
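
As a rough illustration of the two kinds of expression above, here is a minimal Python sketch. It is not PCP code: the function names, sample layout and numbers are all made up for the example, and it ignores counter wrap.

```python
def delta_ratio(prev, curr):
    """Return delta(value) / delta(other_value) between two samples.

    Each sample is a (value, other_value) pair of counter readings.
    Returns None when other_value did not advance, because the ratio
    is undefined for that interval.
    """
    d_value = curr[0] - prev[0]
    d_other = curr[1] - prev[1]
    if d_other == 0:
        return None
    return d_value / d_other


def aggregate(values):
    """Sum the current values of several existing metrics."""
    return sum(values)


# Hypothetical counters: cumulative time (msec) and cumulative messages.
prev = (1200, 40)
curr = (1500, 50)
avg_time_per_message = delta_ratio(prev, curr)  # 300 msec over 10 messages

# Hypothetical per-kind message counts folded into one "messages" value.
messages = aggregate([12, 7, 30, 5])
```

Returning None for an interval with no new messages is one possible choice; a real implementation might instead report "no value available" in whatever way the PMAPI already does.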

This all belongs in each PMAPI client, so it works for performance
metrics from pmcd and from archives.  Another reason for this being a
client-side feature is that it should be available without access to, or
the capability to, reconfigure the PMDAs and/or pmcd on the collector
system. And finally the stateful computation of delta(value) /
delta(other_value) can only be sensibly done at the per-process
level on the client side.
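
To make the "stateful" point concrete, here is a small Python sketch of the per-client state such a computation needs. The class and method names are hypothetical and nothing here is part of the PMAPI; the point is only that each client has to remember its own previous sample, and that the first fetch cannot yield a value.

```python
class DeltaRatio:
    """Per-client state for delta(value) / delta(other_value)."""

    def __init__(self):
        # Previous (value, other_value) sample seen by this client.
        self._prev = None

    def update(self, value, other_value):
        """Record a new sample and return the ratio of deltas, or None.

        None is returned for the first sample (no previous sample to
        difference against) and when other_value did not advance.
        """
        prev, self._prev = self._prev, (value, other_value)
        if prev is None:
            return None
        d_other = other_value - prev[1]
        if d_other == 0:
            return None
        return (value - prev[0]) / d_other
```

Two clients sampling at different intervals would each hold their own DeltaRatio instance and see different (but individually consistent) results, which is why a shared collector-side value would not make sense.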

So one could consider a derived metrics module to be an old SysV-style
streams module that has a private configuration file and is inserted
between a PMAPI client and the source of PCP metrics.

I'd like some initial feedback on the following set of limitations
and assumptions.
     1. Only works for platforms with ELF binaries.  I'm planning to use
        $LD_PRELOAD to optionally insert a DSO between the PCP client
        and libpcp.so to intercept calls and rewrite them as needed.
     2. Only guaranteed to work for the synchronous PMAPI variants.  So
        for example I'll make pmLookupName() work, but invest no effort
        in the asynchronous pair pmRequestNames() and pmReceiveNames().
        As an aside, does anyone know of a living user of these
        asynchronous interface extensions to the original PMAPI?
     3. Configuration file pathname comes via a new PCP environment
        variable, probably $PCP_DERIVED_CONFIG.
     4. No recursive definitions.  Each derived metric is an expression
        involving metrics that are NOT derived.
     5. Performance is not an issue.  Some of the re-writing is not
        going to be cheap, especially in terms of the demands on
        *alloc().
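
Items 3 and 4 can be sketched as follows, in Python for brevity. Only the $PCP_DERIVED_CONFIG variable name comes from the list above; the "name = expression" file syntax, the function names and the error handling are assumptions made purely for illustration.

```python
import os
import re

# Rough pattern for a metric name inside an expression (an assumption).
NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.]*")


def parse_config(text):
    """Parse 'name = expression' lines (a made-up syntax) into a dict."""
    defs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, expr = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {line!r}")
        defs[name.strip()] = expr.strip()
    return defs


def check_not_recursive(defs):
    """Reject any expression that mentions another derived metric."""
    for name, expr in defs.items():
        used = set(NAME.findall(expr)) & set(defs)
        if used:
            raise ValueError(f"{name} refers to derived metric(s) {sorted(used)}")


def load_derived_config(environ=os.environ):
    """Load definitions from the file named by $PCP_DERIVED_CONFIG."""
    path = environ.get("PCP_DERIVED_CONFIG")
    if path is None:
        return {}  # feature not enabled for this client
    with open(path) as f:
        defs = parse_config(f.read())
    check_not_recursive(defs)
    return defs
```

With this check, a file containing "a = x + y" followed by "b = a * 2" is rejected, because b is defined in terms of the derived metric a.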

Some of this is piggy-backing on knowledge gained from the recent
dynamic PMNS changes, although the implementation is disjoint.

I have the bones of a proof of concept implementation, so there is a
real chance it may happen this time around.

Let me know your thoughts.

Cheers, Ken.