
To: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Subject: python API quirks (was Re: pcp updates: kenj merge, numastat, memleak fixes, qa, qa, qa)
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Mon, 14 Apr 2014 00:50:26 -0400 (EDT)
Cc: PCP Mailing List <pcp@xxxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <y0m61mf7w89.fsf@xxxxxxxx>
References: <1821183812.3777009.1397200271225.JavaMail.zimbra@xxxxxxxxxx> <1429874976.3777376.1397200365901.JavaMail.zimbra@xxxxxxxxxx> <y0m61mf7w89.fsf@xxxxxxxx>
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Thread-index: Sn1qviGNjFeBjnmQsoW9sGC0YMK8Hg==
Thread-topic: python API quirks (was Re: pcp updates: kenj merge, numastat, memleak fixes, qa, qa, qa)

----- Original Message -----
> 
> Just took a look at the new pcp-free.py script; nice and short.
> Some aspects though concern me:
> 
> - a python-binding issue where clients must manually manage
>   pmResult memory, as in:

Ayup, this concerns me also.

> 136        result = self.context.pmFetch(pmids)
> 137        values = self.extract(descs, result)
> 138        self.context.pmFreeResult(result)
> 
>   Can we hide this within the bindings (implicitly extracting
>   values, and freeing the pmapi level results)?

Not easily, having tried a couple of things.  My first thought was to
handle this in the destructor, but that's problematic (we should *also*
do that, I think, but it's not enough).  The problem is that we often
need to fetch in a loop, and the destructor is not triggered until some
time after that, once the result has gone out of scope (and after we've
leaked oodles of memory from repeated allocations of the pmResult
structure).
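
To make the loop problem concrete, the shape I have in mind is roughly
this (sketch only, mirroring pcp-free.py's fetch/extract/report cycle;
self.interval is a stand-in for whatever sampling delay the tool uses):

    import time

    while True:
        result = self.context.pmFetch(pmids)
        values = self.extract(descs, result)
        self.report(values)
        time.sleep(self.interval)
        # no explicit pmFreeResult here - relying on a destructor means
        # the previous pmResult lingers until the interpreter finalizes
        # it, so the libpcp-level allocations pile up across iterations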

I believe a cleaner, more pythonic approach would be to use the "with"
keyword... something like:

    with self.context.pmFetch(pmids) as result:
        values = self.extract(descs, result)
        self.report(values)

This evidently calls the __exit__ handler at the right time - we'd need
to add one of those.  However, that's where it all falls down: the return
from pmapi.pmFetch is a ctypes *pointer* to a pmResult, not a pmResult
itself.  That pointer type is synthesized by ctypes, not one of the types
we define, hence there's nowhere for us to implement this __exit__ routine.

Maybe we can create a new class inheriting from the POINTER(pmResult)
class and return that from pmFetch, then implement __exit__ on that
class... not sure, further experimentation required.
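
As a fallback, wrapping (rather than inheriting) might be simpler - a
small object that owns the context and the fetched pointer and frees it
on exit.  Completely untested sketch, and the FetchedResult/fetch_managed
names are made up, not part of the current bindings:

    class FetchedResult(object):
        """Context manager owning one pmFetch result."""
        def __init__(self, context, result_p):
            self._context = context      # the pmContext that did the fetch
            self._result_p = result_p    # ctypes POINTER(pmResult)

        def __enter__(self):
            return self._result_p        # hand the raw pointer to the body

        def __exit__(self, exc_type, exc_value, traceback):
            self._context.pmFreeResult(self._result_p)
            return False                 # never swallow exceptions

    def fetch_managed(context, pmids):
        """Fetch whose result frees itself when the with-block ends."""
        return FetchedResult(context, context.pmFetch(pmids))

The call site would then read much like the example above:

    with fetch_managed(self.context, pmids) as result:
        values = self.extract(descs, result)
        self.report(values)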

> - a coding-style issue, where this client makes assumptions about the
>   scaling values of metrics, specifically report(), which assumes all
>   values[] are in Kbytes.  This happens to be true today, but there's
>   no guarantee that metric units/scales will never change.  Such apps
>   should instead use pmConvScale to assert/adapt.

*nod*.  In practice that is exceedingly unlikely for these particular
metrics (the kernel ABI is fixed - literally thousands of tools would
break if it changed), but you're certainly right in general - I shall go
and make this change pronto.
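
For the record, the change I have in mind looks roughly like this - a
helper that rescales each extracted atom through the metric's descriptor
instead of trusting the units.  I'm going from memory on the binding's
argument order for pmConvScale, so treat it as a sketch and check it
against pmapi.py before copying:

    import cpmapi as c_api

    def to_kbytes(self, descs, index, atom):
        """Rescale one extracted pmAtomValue into Kbyte space units."""
        rescaled = self.context.pmConvScale(c_api.PM_TYPE_U64, atom,
                                            descs, index,
                                            c_api.PM_SPACE_KBYTE)
        return rescaled.ull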

> [...] the python pmapi bindings
> should automate metric value fetching / scaling / decoding more fully
> (and more declaratively if possible).

Definitely.  I believe the original python API author intended the "pmcc"
module to help solve these sorts of issues.  Handling of rate conversion
for counters is another candidate for abstraction... anything with a
level of complexity beyond these simple tools (which don't even need to
rate convert) should really go via an API.  I think Stan went with a new
pmsubsys module to abstract some of the atop / collectl common code in
this way.  We are at risk of producing more and more APIs above the
tools, each doing the same things differently.  This happened with the
C tools, heh, you'd think we'd learn.  :)
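
For the counter case, the boilerplate every tool ends up repeating (and
which a higher-level module could hide) is essentially just this - plain
python, no binding specifics assumed:

    def rate_convert(prev_value, prev_time, curr_value, curr_time):
        """Turn two counter samples into a per-second rate."""
        delta_t = curr_time - prev_time
        if delta_t <= 0:
            return None          # same sample, or clock went backwards
        delta_v = curr_value - prev_value
        if delta_v < 0:
            return None          # counter wrapped or was reset
        return delta_v / float(delta_t)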

This doesn't overly affect these super-simple tools, but as we build up
to the next level of complexity we need to start thinking about these
things.

cheers.

--
Nathan
