pcp
[Top] [All Lists]

Re: [pcp] PCP papi PMDA

To: William Cohen <wcohen@xxxxxxxxxx>
Subject: Re: [pcp] PCP papi PMDA
From: Lukas Berk <lberk@xxxxxxxxxx>
Date: Wed, 2 Jul 2014 09:53:31 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <53B1AF59.9080506@xxxxxxxxxx>
References: <20140625024307.GA9275@xxxxxxxxxx> <53B1AF59.9080506@xxxxxxxxxx>
Reply-to: Lukas Berk <lberk@xxxxxxxxxx>
User-agent: Mutt/1.5.23 (2014-03-12)
Hey,

Thanks for taking a look! Comments inline.

* William Cohen <wcohen@xxxxxxxxxx> [2014-06-30 14:41]:

[...]

> The following is more of an desired future feature that requires
> changes to current papi.  It would be really nice if this (and other
> pmdas) allowed grouping by cgroups.  This would be really useful for
> things using containers such as docker and openshift.  Then it would
> make it easy to pick out problem containers.

Absolutely agreed, definitely an end goal.

> Some processors do not implement all versions of
> pap.kernel.L[123].[IDT]CM listed in metrictab imlemented.  Do these
> fail gracefully in the papi pmda?

Currently papi pmda will just respond with a value of 0 for the given
(unavailable) metric, I'd agree that this needs to fail with a slightly
more explicit, and helpful, message.

> Is it expected when trying my locally built pcp RPM with papi patches
> I needed to manually do the following or is something missing for the
> install?
> 
>   cd /var/lib/pcp/pmdas/papi
>   sudo ./Install

Yes, aiui this is a fairly standard way to install pmdas.

> "pminfo -FT papi" only appears to work as root? Shouldn't there be a
> note on that? Or would it be possible to pminfo just check to see that
> the event is available within a process and extrapolate that it should
> be aavailable system-wide?  

Yes, use of perf counters is restricted to root/privileged users
currently.  I can definitely add a note about that (is there a precedent
to how pcp advises users on this?).  In the future, when per-process
use is supported, I think it would make sense to allow users to check
the metrics of their own processes, and then extrapolate that the
counter is available system wide for root use.

> Also got messages like the following for virtually all of the events
> (there isn't anything else using the pmu at the moment):
> 
> papi.kernel.L1_STM
> Help:
> Level 1 store misses.
> Error: Try again. Information not currently available

Yes, that makes sense.  With the activate-metric-when-asked-for
approach, the pmda takes the first request and activates the papi
counter and returns the error that the information isn't available yet.
Subsequent requests will provide the metric values.

> In papi_fetchCallBack() the switch statement codes the magic numbers
> to various entries.  Isn't there a more compact and regular way to
> code that?  Would it be possible to use negative number for control
> and have the events be positive to avoid having to offset some values by 1?

Looking at that portion of the code, you're absolutely correct.  After
refactoring out the PAPI_add_event code, there's no real reason to have
the switch statement in the fetchcallback.  FWIW the 'offset by 1' is an
artifact of the enable_counters variable (used in the pmstore enabling
counters approach).  I'll reorganize the pmns and fetchcallback function
as needed.

Thanks for the feedback!

Cheers,

Lukas

Attachment: pgpTwPfJN4Q2h.pgp
Description: PGP signature

<Prev in Thread] Current Thread [Next in Thread>