Hi Nathan,
On Tue, May 27, 2014 at 09:08:32PM -0400, Nathan Scott wrote:
> > wondering if there is a better way in general to achieve my goal here
> > (retrieve all values/indoms for all metrics for all timestamps).
>
> This class of problem sounds suited to solving using the same model
> that pmlogsummary uses. It performs sequential result scanning via
> pmFetchArchive(3), with a single pmSetMode at the start to set the
> initial archive offset.
>
> As it passes through the pmResult structures, it constructs a data
> structure alot like the one you describe above (written in C though).
> It uses a hash of all PMIDs (key == PMID, value == "struct aveData")
> wherein each PMID hash value contains a list of all instances that
> grows dynamically as the archive is scanned and new instances found.
>
> Then at the end of scanning the archive, the now in-memory PMID hash
> is walked, final calculations are done, and a report printed out. In
> the end, it doesn't use pmGetInDom[Archive] at all, but instead uses
> pmNameInDom(3).
Thanks for the hints. I've now switched to using pmFetchArchive (how I did
not notice this function before is beyond me) and pmNameInDomArchive. So now
the pseudocode is something like the following:
    while True:
        result = context.pmFetchArchive()
        for i in range(result.contents.numpmid):
            pmid = result.contents.get_pmid(i)
            desc = context.pmLookupDesc(pmid)
            count = result.contents.get_numval(i)
            if count <= 1:  # No indoms are present
                ...extract value...
            else:
                for j in range(count):
                    inst = result.contents.get_inst(i, j)
                    indom_name = context.pmNameInDomArchive(desc, inst)
                    ...extract data...
Since pmNameInDomArchive was quite high up in my profiling, I cached its
result in a dictionary, so that indom_cache[(i, j)] = indom_name.
This way I only look it up the first time the metric appears, and
I shave off 40% of the time needed to parse the archive (the rest is
dominated by Python casts and by pmExtractValue calls, for which the ways
to improve are less obvious). Is this a safe thing to do? Am I guaranteed
that the mapping (i, j)->indom_name will stay the same in an archive?
Somehow I assume that is not the case (pmcd restart with new PMDA, etc.),
but maybe I'll get lucky ;)
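
For reference, here is a minimal sketch of the kind of cache I mean. It is
keyed on (indom, inst) rather than on the loop positions (i, j), on the
assumption that positions in a pmResult may differ from one archive record
to the next. The class name InstanceNameCache and the fake_lookup helper are
hypothetical; in real use the lookup would wrap
context.pmNameInDomArchive(desc, inst). The sketch still assumes that an
instance id keeps the same name for the whole archive, which is exactly the
open question.

```python
from typing import Callable, Dict, Tuple


class InstanceNameCache:
    """Memoize instance-name lookups keyed by (indom, inst)."""

    def __init__(self, lookup: Callable[[int, int], str]) -> None:
        # 'lookup' is whatever performs the expensive name resolution.
        self._lookup = lookup
        self._names: Dict[Tuple[int, int], str] = {}
        self.misses = 0  # number of times the real lookup was called

    def name(self, indom: int, inst: int) -> str:
        key = (indom, inst)
        if key not in self._names:
            # First time this instance is seen: do the real lookup once.
            self.misses += 1
            self._names[key] = self._lookup(indom, inst)
        return self._names[key]


def fake_lookup(indom: int, inst: int) -> str:
    # Hypothetical stand-in for context.pmNameInDomArchive(desc, inst).
    return "indom%d-inst%d" % (indom, inst)


cache = InstanceNameCache(fake_lookup)
first = cache.name(60, 0)   # triggers the underlying lookup
second = cache.name(60, 0)  # served from the dictionary
```

If the mapping turns out not to be stable, the same structure could be
cleared whenever the archive signals a change, at the cost of repeating
the lookups.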
cheers,
Michele
--
Michele Baldessari <michele@xxxxxxxxxx>
C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D