On Sun, Apr 17, 2011 at 11:18 PM, Ken McDonell <kenj@xxxxxxxxxxxxxxxx> wrote:
> On Thu, 2011-04-14 at 16:22 +0200, Andrew Beekhof wrote:
>> ...
>> ---------- Forwarded message ----------
>> From: Andrew Beekhof <andrew@xxxxxxxxxxx>
>> Date: Wed, Apr 6, 2011 at 1:45 PM
>> Subject: Re: [Matahari] Forw: matahari: comparing Sigar and PCP for
>> data gathering
>> To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
>> ...
>> First up, thank you for the very detailed analysis.
>> The challenge, from my perspective, is not so much "does PCP check the
>> same boxes as Sigar" but more "Is there something compelling in PCP
>> that makes the migration work worthwhile".
>>
>> To date, I'd have to say no.
>>
>> That said, and in contrast to what you're saying about PCP, Sigar
>> upstream is not exactly responsive.
>> So depending on what level of frustration we reach trying to get
>> our Windows device name patch merged, this may well provide
>> sufficient justification to switch.
>
> Let me start by declaring my bias ... I am one of the original PCP
> architects and I've been actively involved in PCP for the past 18 years.
Understood :-)
>
> My knowledge of Matahari and Sigar is limited to reading the web pages,
> downloading Sigar source and a quick inspection of the code.
>
> In comparison to other performance monitoring uses of PCP, the Matahari
> demands appear to be modest in terms of the scope of metrics and
> services, so I am not surprised that other collection infrastructures
> like Sigar would be sufficient. Of course Matahari is doing many things
> (especially in the control area) that are outside PCP's capabilities.
>
> I suspect the real benefits of re-basing using PCP would be the future
> options for expanding Matahari or indeed introducing new performance
> management services that complement Matahari. Some of the PCP features
> that may be useful include:
>
> * Vastly more performance data is available "out of the box" with
> a plug-in architecture that allows new pools of information to
> be exported easily and efficiently ... this can be done within
> the PCP project or outside it. The "don't instantiate it unless
> you're asked for it" model in PCP means the quiescent overheads
> are very, very low, even if a large volume of data is
> potentially available.
Interesting
> * PCP's archive services allow independent decisions about what
> data should be logged and when. Once an archive has been
> created it is processed by clients using the _same_ API that is
> used for live collection and monitoring. This allows powerful
> retrospective analysis (what's different today compared to
> yesterday, or last week, or the previous software release?) and
> capacity planning.
> * PCP provides complete metadata for all the exported data, so in
> addition to the metric's name, you can discover that it is a
> signed 64-bit counter in units of microseconds or an unsigned
> 32-bit instantaneous value in units of Mbytes. This allows
> client applications to make sensible and automated decisions
> about how to handle the stream of values in terms of units
> conversion, scaling, rate conversion, wrap handling, etc.
Nod. QMF gives us a similar capability.
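As an aside, that kind of metadata is what lets a client handle counter
streams generically. A rough sketch in Python of the idea (illustrative
only, not PCP API code; the metric names and numbers are made up):

```python
# Sketch (not PCP code): how a client can use metric metadata
# ("unsigned 64-bit counter") to rate-convert a sample stream,
# correcting for wraparound automatically.

U64_MOD = 1 << 64  # wrap modulus implied by "unsigned 64-bit" metadata

def rate(prev_value, curr_value, interval_sec, wrap_mod=U64_MOD):
    """Per-second rate between two counter samples, wrap-corrected."""
    delta = curr_value - prev_value
    if delta < 0:               # counter wrapped between the two samples
        delta += wrap_mod
    return delta / interval_sec

# Example: a counter that wrapped just before the second sample;
# 500 units were consumed over 10 seconds.
prev = U64_MOD - 100
curr = 400
print(rate(prev, curr, 10.0))  # -> 50.0
```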
> * The client-server architecture of PCP means it already provides
> efficient and robust protocols for shipping performance data,
> meaning it is ready for both single node and multi-node
> monitoring for homogeneous clusters, federated clusters, arrays,
> grids, clouds, etc. It appears that part of Matahari may be
> doing the same thing to pull the performance data from Sigar, so
> there may be a potential for Matahari to leverage the PCP
> protocols and reduce Matahari complexity.
Matahari uses QMF/qpid as our comms bus, so it's all neatly hidden away
from us :-)
> * If Matahari is expected to provide a range of alarms and alerts
> for performance-related issues, then the inference engine within
> PCP (pmie) is extremely powerful. pmie evaluates predicates (in
> a first-order predicate calculus) against a stream of data with
> arbitrary actions executed when the predicates are found to be
> true. So rules that capture predicates like "if some network
> interface ..." or "if all cpus ..." are easy. Using the PCP
> APIs rules can be developed and tested with archive data before
> deployment on production systems.
The ability to fire off events under certain performance conditions
might indeed be an interesting capability.
I'll keep that in mind.
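For the record, my understanding is that a pmie rule reads roughly like
the sketch below (metric name, threshold, and action are illustrative
guesses, not taken from a real deployment):

```
// Sketch of a pmie rule: every 30 seconds, if any network interface
// is seeing more than 100 input errors per second, run a shell command.
delta = 30 sec;
some_inst ( network.interface.in.errors > 100 )
    -> shell "logger -p daemon.warning 'high NIC input error rate'";
```

Since the same APIs replay archives, a rule like this could apparently be
tested against recorded data before being deployed live.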
>
> That will do for now, but I think it is fair to say that the PCP
> community is both responsive and open to suggestions, so if there is
> some feature/function that would make a better fit with Matahari (or
> indeed any upstream value-added consumer of performance data), we'd be
> keen to have a discussion about that.
Thank you very much for taking the time to explain a bit more about the
PCP project.
We'll certainly be in touch if we're in a position to migrate :-)