Hi, Nathan -
> [...]
> BTW, there are several changes in fche/for-merge that you've not put
> in the summary - any reason for that? Some of them look useful and
> possibly merge-ready too. Others, maybe not ready yet ... hmmm, who
> knows? It makes it impossible to just git pull into the dev branch,
> anyway, which slows my workflow.
I considered all the for-merge bits ready to go, just didn't want
to advertise them in the context of the python work.
> Comments inline... (I'm still hopeful Ken will review the new qa/666
> and offer suggestions - 6 minutes is still a really, really long time
> for a test to run - its >5 minutes longer than most other tests and
> about double the next slowest test. And *that* other test is already
> a big problem IMO so I'm very reluctant to add more QA laggards).
Fair enough, but I'm reluctant to keep working on it more without a
definite "good enough" target. (Of course one could reduce test
coverage to save time, but that's not dignified.)
> [...]
> > While in the vicinity, fix pmLookupName's exception message to
> > identify the metrics that failed resolution, instead of an internal
> > magic python char* rune.
>
> Hmm, not so sure about this - looking inside the libpcp code, errors
> returned here are not related to the names at all, right? (they are
> runtime errors, so things like PDU I/O errors, and so on).
In this particular case, the error parameter was for an informative
name, but the object passed back was the ctypes char* rather than the
python string, which made the error message unnecessarily opaque.
> Passing out the names doesn't really help to understand the failure,
> AFAICT? It looks like we should just treat this one the same as all
> the other pmErr exceptions (IOW, neither pmid/names list passed out).
> [...]
There are many cases where the pcp/pmapi.py code does something like:
raise pmErr, (status, nameA)
In these cases, returning legible parameters (a native python object
rather than nameA) would be helpful.
> What was the error case you came across here, OOC?
Sending bad metric names to pcp-graphite.py, either before or after
the PMNS traversal code-additions.
> [...]
> > commit 163db2e7ae96437a33ca8a0535af8bde35e95a34
> > Author: Frank Ch. Eigler <fche@xxxxxxxxxx>
> > Date: Sat Apr 12 21:46:51 2014 -0400
> >
> > libpcp __pmSetProgname: copy incoming string
> [...]
>
> Wow, amazed that I have not ever observed this! Do any/all of
> tests 739, 741, 742, 743, 991 fail for you? [...]
No.
> > In order to make it safer for mankind, __pmSetProgname henceforth copies
> > the incoming const char*'s with strdup().
>
> So, I'm trying to ignore the fact that you're once again introducing
> GNU coding style, into the middle of libpcp here (seriously? oh the
> humanity!).
Oops, you've uncovered my secret scheme to eventually convert the
whole codebase!
> But lets be more objective - all the people using valgrind on C
> PMAPI tools will gnash their teeth with this change, as they will
> start to get warnings now. [...] If we were to make this kind of
> change the qa/valgrind-suppressions file would need to be updated.
Did you see an actual warning from this change? For programs that
call __pmSetProgname only once (as is typical), there is no "leak"
in that a global variable maintains a reference to the object. I
can't get valgrind to even mention it.
> [...] At the end of the day, this is a python API problem, ideally
> we'd solve it there and not affect the C or other language tools.
I disagree. There is no guarantee that a program's original argv[0]
pointer needs to point at a stable string.
> Does the patch attached resolve the problem you observed?
I'm not sure it's sufficient, and IMHO the patch is worse: it still
introduces the same kind of leak, and is now far removed from the
place in libpcp that holds onto that pointer and makes the poor
assumption.
> Do we have QA coverage of the problem with those existing tests I
> wrote (above) or do we need to add something more? (any hints re
> increasing chances of hitting it?)
You could strcpy(argv[0], "bozo the clown"); in a C PMAPI program
sometime after __pmSetProgname().
The only other patch in fche/for-merge was this:
commit e54d33f7fde4208a01107cd807292e9aaeffd3f6
Author: Frank Ch. Eigler <fche@xxxxxxxxxx>
Date: Sat Apr 12 14:32:46 2014 -0400
pcp.sh: add Available Commands: list to usage message, like already
documented
I see that you have your own version in /dev now. FWIW, I like this
one better because its logic matches that of what is done later in
pcp.sh, namely the ordering and executability checking, and does
not litter in $tmp. YMMV, HTH, HAND, & c.
- FChE
|