
Re: [pcp] Fwd: [Matahari] Forw: matahari: comparing Sigar and PCP for da

To: Andrew Beekhof <andrew@xxxxxxxxxxx>
Subject: Re: [pcp] Fwd: [Matahari] Forw: matahari: comparing Sigar and PCP for data gathering
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Mon, 18 Apr 2011 07:18:14 +1000
Cc: matahari@xxxxxxxxxxxxxxxxxxxxxx, pcp@xxxxxxxxxxx, "Frank Ch. Eigler" <fche@xxxxxxxxxx>
In-reply-to: <BANLkTimy0_jQmD_oevpXseVTxwuaoJF3mg@xxxxxxxxxxxxxx>
References: <mailman.24111.1301676824.5826.perftools-list@xxxxxxxxxx> <y0m8vvolock.fsf@xxxxxxxx> <BANLkTinmX+kcbiX+fcJgC1n7WTKJyjsf5g@xxxxxxxxxxxxxx> <BANLkTimy0_jQmD_oevpXseVTxwuaoJF3mg@xxxxxxxxxxxxxx>
Reply-to: kenj@xxxxxxxxxxxxxxxx
On Thu, 2011-04-14 at 16:22 +0200, Andrew Beekhof wrote:
> ...
> ---------- Forwarded message ----------
> From: Andrew Beekhof <andrew@xxxxxxxxxxx>
> Date: Wed, Apr 6, 2011 at 1:45 PM
> Subject: Re: [Matahari] Forw: matahari: comparing Sigar and PCP for
> data gathering
> To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
> ...
> First up, thank you for the very detailed analysis.
> The challenge, from my perspective, is not so much "does PCP check the
> same boxes as Sigar" but more "Is there something compelling in PCP
> that makes the migration work worthwhile".
> 
> To date, I'd have to say no.
> 
> That said, and in contrast to what you're saying about PCP, Sigar
> upstream is not exactly responsive.
> So depending on what level of frustration we reach trying to get our
> Windows device name patch merged, this may well provide sufficient
> justification to switch.

Let me start by declaring my bias ... I am one of the original PCP
architects and I've been actively involved in PCP for the past 18 years.

My knowledge of Matahari and Sigar is limited to reading the web pages,
downloading the Sigar source and a quick inspection of the code.

In comparison to other performance monitoring uses of PCP, the Matahari
demands appear to be modest in terms of the scope of metrics and
services, so I am not surprised that other collection infrastructures
like Sigar would be sufficient.  Of course Matahari is doing many things
(especially in the control area) that are outside PCP's capabilities.

I suspect the real benefits of re-basing using PCP would be the future
options for expanding Matahari or indeed introducing new performance
management services that complement Matahari.  Some of the PCP features
that may be useful include:

      * Vastly more performance data is available "out of the box" with
        a plug-in architecture that allows new pools of information to
        be exported easily and efficiently ... this can be done within
        the PCP project or outside it.  The "don't instantiate it unless
        you're asked for it" model in PCP means the quiescent overheads
        are very, very low, even if a large volume of data is
        potentially available.
      * PCP's archive services allow independent decisions about what
        data should be logged and when.  Once an archive has been
        created it is processed by clients using the _same_ API that is
        used for live collection and monitoring.  This allows powerful
        retrospective analysis (what's different today compared to
        yesterday, or last week, or the previous software release?) and
        capacity planning.
      * PCP provides complete metadata for all the exported data, so in
        addition to the metric's name, you can discover that it is a
        signed 64-bit counter in units of microseconds or an unsigned
        32-bit instantaneous value in units of Mbytes.  This allows
        client applications to make sensible and automated decisions
        about how to handle the stream of values in terms of units
        conversion, scaling, rate conversion, wrap handling, etc.
      * The client-server architecture of PCP means it already provides
        efficient and robust protocols for shipping performance data,
        meaning it is ready for both single node and multi-node
        monitoring for homogeneous clusters, federated clusters, arrays,
        grids, clouds, etc.  It appears that part of Matahari may be
        doing the same thing to pull the performance data from Sigar, so
        there may be a potential for Matahari to leverage the PCP
        protocols and reduce Matahari complexity.
      * If Matahari is expected to provide a range of alarms and alerts
        for performance-related issues, then the inference engine within
        PCP (pmie) is extremely powerful.  pmie evaluates predicates (in
        a first-order predicate calculus) against a stream of data, and
        executes arbitrary actions when the predicates are found to be
        true.  So rules that capture predicates like "if some network
        interface ..." or "if all CPUs ..." are easy to write.  Because
        pmie uses the same PCP APIs, rules can be developed and tested
        against archive data before deployment on production systems.
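
To make the metadata point above a little more concrete, here is a
rough sketch of how a client might use a metric's descriptor to decide
between rate conversion and pass-through, including counter wrap
handling.  This is not the PCP API: MetricDesc, its fields and rate()
are hypothetical names invented for illustration, and the sketch assumes
at most one wrap of an unsigned counter between two samples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDesc:
    """Hypothetical stand-in for per-metric metadata (not the PCP API)."""
    name: str
    bits: int        # width of the value, e.g. 32 or 64
    signed: bool     # True for signed integer metrics
    semantics: str   # "counter" or "instant"
    units: str       # e.g. "microsec", "Mbyte"


def rate(desc: MetricDesc, prev: int, curr: int, interval_sec: float) -> float:
    """Turn two successive samples into a reportable value, guided by metadata."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    if desc.semantics != "counter":
        # Instantaneous values are reported as-is; no rate conversion applies.
        return float(curr)
    delta = curr - prev
    if delta < 0 and not desc.signed:
        # Assumption: the unsigned counter wrapped exactly once between samples.
        delta += 1 << desc.bits
    # Result is in the metric's units per second.
    return delta / interval_sec


if __name__ == "__main__":
    # Made-up 32-bit unsigned byte counter that wraps between two samples.
    nbytes = MetricDesc("example.bytes", 32, False, "counter", "byte")
    print(rate(nbytes, 4294967290, 10, 2.0))   # prints 8.0
```

The client code never needs to know in advance which metrics are
counters; it reads the descriptor and lets that drive the handling.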

That will do for now, but I think it is fair to say that the PCP
community is both responsive and open to suggestions, so if there is
some feature/function that would make a better fit with Matahari (or
indeed any upstream value-added consumer of performance data), we'd be
keen to have a discussion about that.

Cheers, Ken.
