On 14/01/14 14:48, Nathan Scott wrote:
> Hi Ken,
>
> As Frank discovered, reported and worked-around in oss
> bugzilla #1041 there are pathological configurations for
> pmlogger (coming out of pmlogconf thanks to me! *cough*)
> which blow out the size of the generated archives.
> [ http://oss.sgi.com/bugzilla/show_bug.cgi?id=1041 ]
>
> This resulted from pmlogger logging the same metrics more
> than once per sample interval, if they're presented in
> different configuration blocks - even if those blocks use
> the same interval and permission states as each other.
By "permission" I presume you mean advisory/mandatory?
> Frank's workaround is simple and effective (which is good,
> as we're due for a release), but I wanted to check in and
> see if you think we should continue to hack in this area,
> as I think we (collectively) probably should.
pmlogger is core technology ... it is never a waste of time trying to
improve things here.
> I'd like to make pmlogger set up metric logging tasks more
> independently, irrespective of the separate configuration
> file blocks and the order in which they are presented - it
> seems this can currently have a big impact on how much is
> logged, which is wrong IMO.
How does the order of the config file blocks impact on "how much is logged"?
> Any blocks which have common interval/state ...
"state" == "permission" above?
> ... could be merged
> into a single task_t (dups removed) and allow the optfetch
> code to further optimise the fetches for each group. But,
> at the moment this is not possible due to the way pmlogger
> forces each config-file-chunk to be a separate task_t.
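To make the proposed merge concrete (a sketch only, borrowing two
metrics from the sample PMDA), blocks like

log advisory on every 1 minute {
    sample.bin
}

log advisory on every 1 minute {
    sample.colour
}

could collapse into the single equivalent task_t

log advisory on every 1 minute {
    sample.bin
    sample.colour
}

at which point optfetch sees one group and can bundle both metrics
into fewer pmFetch requests.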
> I started going down the path of attempting to assign each
> metric into common tasks, during parsing - using find_task()
> which is used when pmlc adds metrics. It's complicated by
> global state and the need to delay task_t completion (and
> fetch group setup, log_callback registering, etc) until the
> entire config file has been parsed. A WIP patch is attached
> - lots to do, but it shows the general direction.
>
> Any thoughts? Is there a reason why this wasn't tackled
> originally? It seems logical to go this way, but the code
> doesn't; so my spidey-sense is telling me there are some
> subtle issues lurking here... :)
Don't have bandwidth to review the patch at this stage I'm afraid, but
the only issue to watch out for is error handling ... I have a vague
feeling that at some point in the past, if there was a problem with the
initial metadata setup (e.g. bad metric name, bad PMID, no pmDesc
available) for one metric in a group then all the metrics in the group
were omitted from the archive.
But I've just done a small experiment and believe this to NOT be the
case, e.g.
log mandatory on once {
    sample.bin
}

log mandatory on once {
    bogus.metric
}

and

log mandatory on once {
    sample.bin
    bogus.metric
}

produce the same output archive.
The other reason for multiple groups that I vaguely recall was to
"stagger" the pmFetches but this is a false optimization with current
hardware and networks, and makes interpolation less believable (trust me
on this one).
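The staggering idiom presumably looked something like this - same
interval, metrics split across groups purely so the resulting
pmFetches were issued separately (metric names illustrative only):

log advisory on every 1 minute {
    disk.dev.read
}

log advisory on every 1 minute {
    disk.dev.write
}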