On 06/01/15 09:22, Paul Smith wrote:
From time to time I run into a PCP archive with even the latest pmchart where
scrolling through time seems to make pmchart go into CPU hell. Discussion on
IRC with nathan helped me track down that it is the Load Avg part of the
Overview view that is part of the root cause, something to do with discrete
metrics and hinv.ncpu. If I use the vanilla load average view by itself, it
works fine.
After my hasty reply this morning I've been thinking about this some
more, and I think there is more going wrong here than I first thought.
So, Paul, could you please send me an example archive that shows the
problem, along with the pmchart recipe? I'll try to intuit the latter;
please fill in the gaps.
1. -c Overview shows the problem, -c Loadavg does not (this does
implicate hinv.ncpu pretty strongly)
2. any -S or -T command line options?
3. do you change the current time using the time control, or just start
playing?
4. any -t option or change in the update interval via the time control?
If it were simply the logged-once discrete metric, then I would expect
possibly a scan from the current time back to the start of the time
window to find the initial value (if it exists), and then a scan forward
to the end of the time window to find the subsequent value (if it
exists). The code as designed is supposed to do this at most once
(sweeping up all similarly unbounded metrics in one scan), but the
implementation appears to be doing it over and over.
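
To make the intended behaviour concrete, here is a rough Python sketch of the "scan back once, scan forward once, for all metrics together" idea. This is not the PCP code: the function name find_bounds and the in-memory record layout (a time-sorted list of (timestamp, {metric: value}) pairs) are made up purely for illustration, and the sample values are invented.

```python
def find_bounds(records, metrics, now):
    """Find, for every metric, the nearest value at or before `now`
    and the nearest value after `now`, using one backward and one
    forward pass shared by all metrics.

    `records` is a hypothetical stand-in for an archive: a list of
    (timestamp, {metric_name: value}) tuples sorted by timestamp.
    """
    # Index of the first record strictly after `now`.
    split = len(records)
    for i, (ts, _) in enumerate(records):
        if ts > now:
            split = i
            break

    lower, upper = {}, {}

    # Backward pass: from `now` towards the start of the window.
    need = set(metrics)
    for ts, values in reversed(records[:split]):
        for name in list(need):
            if name in values:
                lower[name] = (ts, values[name])
                need.discard(name)
        if not need:
            break

    # Forward pass: from `now` towards the end of the window.
    need = set(metrics)
    for ts, values in records[split:]:
        for name in list(need):
            if name in values:
                upper[name] = (ts, values[name])
                need.discard(name)
        if not need:
            break

    # A metric missing from `upper` is unbounded on that side, e.g. a
    # discrete metric that was logged only once.
    return lower, upper


# Invented sample data: hinv.ncpu logged once, load logged every sample.
records = [
    (0, {"hinv.ncpu": 4, "kernel.all.load": 0.5}),
    (10, {"kernel.all.load": 0.7}),
    (20, {"kernel.all.load": 0.9}),
    (30, {"kernel.all.load": 1.1}),
]
lower, upper = find_bounds(records, ["hinv.ncpu", "kernel.all.load"], now=15)
```

In this sketch the logged-once metric forces the forward pass to run to the end of the window without finding a later value. The point is that once that result is known, it should be remembered rather than rediscovered on every update.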
I wonder if this is the same problem Frank reported some months ago that
I never got to the bottom of?