
Re: pcp updates: pmdaproc, cgroups, books

To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: Re: pcp updates: pmdaproc, cgroups, books
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Thu, 8 Jan 2015 17:46:44 -0500 (EST)
Cc: pcp developers <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <20141219162030.GC11308@xxxxxxxxxx>
References: <1309338393.770280.1416292315684.JavaMail.zimbra@xxxxxxxxxx> <1666386574.2247920.1416427865663.JavaMail.zimbra@xxxxxxxxxx> <2132304544.16180073.1418360958577.JavaMail.zimbra@xxxxxxxxxx> <20141212061823.GC14953@xxxxxxxxxx> <53646500.16198226.1418365398316.JavaMail.zimbra@xxxxxxxxxx> <20141212164033.GD14953@xxxxxxxxxx> <1544484578.17657959.1418625363923.JavaMail.zimbra@xxxxxxxxxx> <20141219162030.GC11308@xxxxxxxxxx>
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Thread-index: f9tb8OHp+dmujIgcg3wtY9/wbqGSAA==
Thread-topic: pcp updates: pmdaproc, cgroups, books

Apologies for the tardy reply - mayhem this week.

----- Original Message -----
> [...]
> Yes, except that pmval is limited to a single metric per invocation.
> A single "pminfo cgroup" run can exercise all metrics, and "pminfo
> cgroup cgroup cgroup ..." can endurance-test all metrics, not just the
> one that we found/fixed this leak in, so it's more forward-looking.

This is trivially tackled, as in test qa/957, and results in a better
test - that one even points out which cluster each pmdalinux bug it
finds is in (done that way on purpose, to save diagnosis time)!
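
For illustration only (this is not the actual qa/957 script, and the
metric groups below are just placeholders), the per-metric, per-cluster
idea looks something like:

    # drive each cgroup metric through pmval individually, grouped so
    # a failure immediately identifies the cluster involved
    for group in cgroup.cpuacct cgroup.memory cgroup.blkio
    do
        echo "=== $group ==="
        for metric in `pminfo $group`
        do
            # one metric per pmval invocation, a couple of samples each
            pmval -s 2 -t 0.1 $metric >/dev/null || echo "FAIL: $metric"
        done
    done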

For bonus points, we get valgrind testing on pmval, which is currently
missing (perhaps the earlier review comment seeking updates to qa/731
was overlooked as well?)
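
Something along these lines, say (just a sketch, not what qa/731 does
today, and the metric name is a placeholder):

    # run pmval under valgrind and flag any definite leaks
    valgrind --leak-check=full pmval -s 2 cgroup.memory.usage \
        >/dev/null 2>valgrind.out
    grep "definitely lost: [1-9]" valgrind.out && echo "pmval leaked"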

Or, simply use "pmdumptext cgroup" (but again, better to split up
the clusters to aid diagnosis).

> It's not the batch-fetching aspect of pminfo that made this test
> workable, but its willingness to traverse a PMNS hierarchy.

(the cgroup PMNS itself is static now, so no refreshing is done there
anymore, of course - it's all about the fetch and/or instance PDUs)

Either I'm missing something, or you're missing what happens in
proc_refresh - no matter how many PMIDs get thrown at a single fetch,
it will refresh each cluster only once per fetch... so it has to be
the fetch batching that triggers the leak, right?
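
If we wanted to confirm that, a comparison along these lines would do
it (metric names are placeholders again) - one batched fetch versus the
same metrics fetched one at a time, both under valgrind:

    # batched: pminfo fetches the cgroup metrics a batch at a time
    valgrind --leak-check=full pminfo -f cgroup >/dev/null 2>batched.out

    # unbatched: the same metrics, one fetch per pmval invocation
    rm -f single.out
    for metric in `pminfo cgroup`
    do
        valgrind --leak-check=full pmval -s 1 $metric \
            >/dev/null 2>>single.out
    done

    grep "definitely lost" batched.out single.out

If only the batched run leaks, that points the finger squarely at the
fetch batching.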

If so, that's as subtle as anything - too much so for a test, IMO,
and using a different tool seems the better option (for this, and
the other reasons above).

cheers.

--
Nathan
