Hi guys,
This patch fixes the precision of the allcpu and percpu millisecond
counters. This has been the source of problems in the past, and it
looks like we missed another size transition a while back, in 2.6.5+.
Historically, 2.4 and earlier kernels used a funky int/long mix for
idle and other CPU times. Some code was put in to try to circumvent
counter wrapping, in the agent, back then (hohum). Then, in 2.6.0
through 2.6.4 (inclusive), all CPU time counters were made 32 bits for
all platforms (argh!). Then, in 2.6.5 (and beyond) _all_ CPU time
counters got changed to be 64 bits unconditionally.
At least someone finally got it right. :] However, we missed this
last transition to 64 bits, and the Linux agent hasn't been updated.
This patch does that, and dynamically sets the type of the CPU time
metrics depending on the 3 kernel versions/flavours.
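
To illustrate the version-dependent typing described above, here is a minimal sketch in C. It is not the code from the patch: the enum, the function name classify_kernel, and the choice to lump 2.5.x development kernels in with the legacy case are all assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical names, for illustration only (not taken from the patch). */
enum cpu_counter_kind {
    CPU_COUNTER_LEGACY_MIXED,   /* 2.4 and earlier: int/long mix */
    CPU_COUNTER_U32,            /* 2.6.0 through 2.6.4: 32 bits everywhere */
    CPU_COUNTER_U64             /* 2.6.5 and later: 64 bits everywhere */
};

/*
 * Classify a kernel release string such as "2.6.5-7.97-default".
 * Only the leading "major.minor.patch" part is examined.
 * Returns 0 on success, -1 if the string cannot be parsed.
 */
int
classify_kernel(const char *release, enum cpu_counter_kind *kind)
{
    int major, minor, patch;

    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
        return -1;

    if (major > 2 ||
        (major == 2 && minor > 6) ||
        (major == 2 && minor == 6 && patch >= 5))
        *kind = CPU_COUNTER_U64;
    else if (major == 2 && minor == 6)
        *kind = CPU_COUNTER_U32;
    else
        /* Assumption: anything before 2.6.0 is treated as legacy. */
        *kind = CPU_COUNTER_LEGACY_MIXED;

    return 0;
}
```

The release string would typically come from uname(2) (the release field of struct utsname), and the result would then pick the metric type once at agent startup.
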
In the process of fixing this, it became clear that the wrap handling
was going to be extremely hard to get right for all cases (it is wrong
now, after the kernel type changes a few years back), so I removed it
completely. I don't think this will affect anyone in practice.
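
For anyone wondering why wrap handling is so fragile, here is a small sketch in C of the usual "assume one wrap" heuristic. This is not the code that was removed from the agent; the function name and the width_bits parameter are hypothetical. The point is only that the correction depends entirely on knowing the counter's real width.

```c
#include <stdint.h>

/*
 * Hypothetical helper: delta between two samples of a counter that is
 * assumed to be width_bits wide, allowing for at most one wrap.
 */
uint64_t
wrap_corrected_delta(uint64_t prev, uint64_t curr, unsigned width_bits)
{
    if (curr >= prev)
        return curr - prev;         /* no wrap observed */
    if (width_bits >= 64)
        return curr - prev;         /* modulo 2^64 arithmetic covers the wrap */
    /* Assume exactly one wrap at 2^width_bits. */
    return (((uint64_t)1 << width_bits) - prev) + curr;
}
```

With samples 0xFFFFFFF0 then 0x10, a width of 32 gives a delta of 0x20, while a width of 64 gives 0xFFFFFFFF00000020 for exactly the same inputs. Guess the width wrong for the running kernel and the "corrected" value is garbage.
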
I noticed the context switch count and interrupt count are also not
being exported correctly, so I fixed those up at the same time (these
ones seem to be always 32 bits on 2.4 and always 64 bits on 2.6).
Finally, there's a new CPU time being accounted in recent 2.6 kernels
("steal") - I've not updated the agent to export that as yet (it is
always zero on my boxen).
cheers.
--
Nathan
fix-linux-percpu-metrics