
RFC: filtered metrics

To: PCP Mailing List <pcp@xxxxxxxxxxx>
Subject: RFC: filtered metrics
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Wed, 25 Sep 2013 05:11:21 -0400 (EDT)
Delivered-to: pcp@xxxxxxxxxxx
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Filtering Metrics
=================

Goal: Transparent mechanism for client tools to perform a metric
modification operation (likely a pmStore(3)) before starting to
fetch values for metrics.  This is to provide a per-user or per-
monitor-host customisation (filtering) of the values returned.

This would be analogous to the existing pmAddProfile(3) mechanism
which allows instances to be restricted (a per-indom and instance
identifier based mechanism).  Like those profiles, this mechanism
would need to be sent post-connect and also allow re-transmission
after a successful context re-connection.
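
As a rough illustration of the send-after-connect and resend-after-
reconnect behaviour, here is a minimal sketch.  The FilteredContext
class and its store callback are hypothetical (the callback stands in
for a pmStore(3) call); none of this is existing libpcp API.

```python
class FilteredContext:
    """Hypothetical client-side context that replays filter stores."""

    def __init__(self, store, filters):
        # store: callable(metric, value), a stand-in for a pmStore(3) call
        # filters: iterable of (metric, value) pairs to apply to this context
        self._store = store
        self._filters = list(filters)
        self.connected = False

    def connect(self):
        self.connected = True
        self._send_filters()

    def reconnect(self):
        # Assumption: per-context filter state does not survive on the
        # server side, so every filter is sent again after a reconnect.
        self.connected = True
        self._send_filters()

    def _send_filters(self):
        for metric, value in self._filters:
            self._store(metric, value)
```

Keeping the filter list on the client side mirrors how a profile has
to be remembered so that it can be re-transmitted.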


Rationale:

We've now observed a number of situations where clients need to be
able to perform "server-side filtering" of the result returned to
the client from pmcd.  Several examples follow to demonstrate this,
but they all suffer from being ad-hoc - some clients support them,
but most do not and have no mechanisms/plans for supporting them.

In particular, pmlogger is a problematic case where it would be good
if it could trigger these server-side-filters for its fetch requests
but its configuration language has no mechanism to allow the filters
to be sent.  A solution outside of all client tools (so, done within
libpcp, transparently) would be ideal.  Something along the lines of
the "derived metrics" model (which could also be improved somewhat,
in terms of usability, at the same time we tackle this - see below).


Existing Examples:

pmdalogger and pmdabash
  -  The event metrics in these PMDAs were found to warrant a store
     operation before their use, for two reasons.
  -  Firstly, it provides a simple permissions model: store access
     can be revoked (by host), and events can only be fetched after
     the PMDA has received a store PDU.
  -  Secondly, real server-side filtering could be performed in the
     form of a regular expression to be applied to matching event
     data (log lines or command strings) reducing the data that we
     need to hold in-memory on the PMDA, and send-over-the-wire to
     clients (and, in theory, on-disk from pmlogger).
pmdaproc threads and cgroups
  -  The per-process instance domain can be filtered to contain all
     processes including threads, or without threads.  The filtering
     mechanism here is an integer (zero/one - off/on), not a string
     regex as before.
  -  It can also be filtered to just the processes within a cgroup.
  -  The cgroup filtering can only be performed per-context, so we
     cannot use this filtering in tools like pmlogger which have no
     knowledge of how to set this up.
pmdasystemd
  -  Is likely to require server-side event filtering in its next
     major update (in addition to use of user credentials).


Approach:

Perhaps we could introduce known directories (/etc/pcp/filtered/ and
$HOME/.pcp/filtered?) with files containing metric name:value mappings
to store (i.e. store <value> to named <metric> with the value type as
defined by <metric> descriptor) at the same points that we perform a
profile send today.  An environment variable, PCP_FILTERED_CONFIG, could
also be used, analogous to the existing derived metrics model.  This
is a little awkward for pmlogger_daily and friends, but will be handy
for command-line-invoked monitoring tools.
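
To make the idea concrete, here is a minimal sketch of how such files
might be located and parsed.  Everything beyond the directory names
and PCP_FILTERED_CONFIG is an assumption for illustration: the one-
mapping-per-line "metric:value" syntax, the "#" comments, the search
order, and the helper names are all hypothetical.

```python
import posixpath


def parse_filters(text):
    """Parse 'metric:value' lines into an ordered list of (metric, value)."""
    stores = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        # Split on the first colon only, so a value such as a regular
        # expression may itself contain colons.
        metric, sep, value = line.partition(":")
        if not sep or not metric.strip():
            raise ValueError("expected metric:value, got %r" % raw)
        stores.append((metric.strip(), value.strip()))
    return stores


def filter_paths(env, home=None):
    """Candidate locations, host-wide first (assumed order)."""
    paths = ["/etc/pcp/filtered"]
    if home is None:
        home = env.get("HOME")
    if home:
        paths.append(posixpath.join(home, ".pcp", "filtered"))
    override = env.get("PCP_FILTERED_CONFIG")
    if override:
        paths.append(override)
    return paths
```

Each (metric, value) pair would then be stored at the points where a
profile is sent today.  Returning the host-wide location first means
per-user and environment settings are applied last; whether later
entries should override or merely add to earlier ones is an open
design question.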

At the same time, we could consider whether we extend the directory
expansion scheme (host-wide and per-user) to add derived metrics to
clients without having to set an environment variable (in addition to
that existing mechanism).


Issues:

Tools like pmchart may want to present UI to allow addition/removal
of filters (with complexity along the lines of filtering in wireshark
perhaps?).  Is an API needed?  Probably.

Should we mandate use of string metrics always for these things?  (The
one case we have that used integers could be done as a string instead,
and it's not yet released.)  This would mean we don't need to do any
descriptor lookups or type checking - it's always a string, straight
out of the filter file. (?)
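
For instance, the zero/one threads toggle could be carried as a string
and interpreted on the receiving side.  A minimal sketch, with a
hypothetical helper name and an assumed "0"/"1" encoding:

```python
def parse_toggle(stored):
    """Interpret a string-valued on/off filter ("0" or "1")."""
    text = stored.strip()
    if text not in ("0", "1"):
        raise ValueError("expected '0' or '1', got %r" % stored)
    return text == "1"
```

The client then passes the text from the filter file through unchanged,
and validation happens only where the value is consumed.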


--
Nathan
