----- Original Message -----
> > > As noted before, there was nothing unusual about my procfs
> > > configuration. Your existing cgroups-root-001 tarball should show the
> > > exact same problems. It was the test code that has been deficient
> > > (not doing enough operations to hit the fd-leak/exhaustion limits, for
> > > example), not the data.
> >
> > *sigh*, no - I'm looking to expand the testing coverage and you clearly
> > had many more cgroups set up than I did & on a more recent kernel version
> > where we have no test coverage yet. [...]
>
> I looked into this further. As promised, the problem is reproducible
> with the existing cgroups-root-001 tarball,
That's great - thanks for looking into it & extending the test.
Setting that aside briefly, coverage would still be enhanced by the tgz
I'm asking for. Relative to the single cgroups-001.tgz case I made, it'd
give coverage for a different kernel version (the contents of some of the
cgroups files differ in more recent kernels).
> [...] but this is made more
> difficult by the test case's construction. This test uses the .so
> pmda variant & pminfo -L/-K runs, so that the test case can force-feed
> it the fake /proc data via $PROC_STATSPATH.
>
> In an echo of early problems with the papi-pmda qa, this style makes
> leaks difficult to find, because they are so ephemeral: you can't just
> do a pminfo loop to exhaust the resources, because they are recreated
> anew for each pminfo!
The problem is the choice of client tool, not an inherent limitation of
the testing as you're suggesting. The case you're reproducing here calls
for a long-running client that issues many fetches, but you're coming at
it with pminfo - try pmval instead?
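For example, something like this (an untested sketch - the -K spec and
the metric name are placeholders, substitute whatever qa/730 actually
uses):

    # point the pmda at the canned /proc data, as the test already does
    export PROC_STATSPATH=$here/procdata

    # one long-running client, many fetches over a single context:
    # 2000 samples at a 1ms interval against the .so pmda variant
    pmval -L -K add,3,$PCP_PMDAS_DIR/proc/pmda_proc.so,proc_init \
        -t 0.001 -s 2000 cgroup.memory.usage

That keeps one PMAPI context alive across all the fetches, so a
per-fetch fd leak can accumulate to the point of exhaustion, where a
pminfo loop tears everything down before the leak can show up.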
> Anyway, it is possible to trigger the problem even with the .so pmda
> variant, just clumsier. Behold pcpfans.git fche/cgroups-test:
Yep, that's very awkward, and I think your rationale (as in the test
comments added there) is not quite right. It could be implemented
more cleanly via pmval, without relying on the batch-fetch logic in
pminfo as I think this change does.
It'd be excellent to get this long-running client case going under
valgrind too (i.e. qa/731, not just the qa/730 modified here) --
pmval with a short sampling interval (a millisecond or two) would do
the trick, I think, and it'd be a good addition to both scripts. It'd
be more readable too, for the next person working on these tests.
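Something along these lines for the qa/731 case (again just a sketch;
the valgrind options would presumably match whatever the script already
uses for its other runs):

    # the same long-running fetch loop, under valgrind this time, so
    # any per-fetch leak shows up in the --leak-check summary
    valgrind --leak-check=full --error-exitcode=1 \
        pmval -L -K add,3,$PCP_PMDAS_DIR/proc/pmda_proc.so,proc_init \
        -t 0.001 -s 500 cgroup.memory.usage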
cheers.
--
Nathan