Hi guys,
----- "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx> wrote:
> I suspect we can solve this problem with a large hammer that will
> work on all platforms (Greg's suggestion won't apparently work on
> Mac OS X).
>
> If the time ever goes backwards (not a small ntp adjustment, I mean
> now - last_delay is < now at the last sample, so time went back more
> than 2*last_delay), there is no point keeping any history and any
> notion of a "rate" over the last interval is pretty wobbly.
> ..
We've now observed a few occurrences of the log entries below in our
production environment... this is a localhost pmie (i.e. running on
the host its monitoring, no pmcd restart or anything here ... it just
spontaneously fails, occassionally, afaict):
Log for pmie on [HOST] rotated Thu Jul 22 00:14:01 2010
pmie: PID = 25944, default host = [HOST]
Thu Jul 22 08:14:13 2010: High load: 5.8 seconds/seconds CPU time
sleepTight: negative delay (-14.907716). sched=1279750458.049223,
cur=1279750472.956939
sleepTight: invalid args: -14 -907716035
sleepTight: negative delay (-9.929606). sched=1279750463.049223,
cur=1279750472.978829
sleepTight: invalid args: -9 -929605960
sleepTight: negative delay (-4.950649). sched=1279750468.049223,
cur=1279750472.999872
sleepTight: invalid args: -4 -950649023
Thu Jul 22 09:50:33 2010: High load: 4.35 seconds/seconds CPU time
sleepTight: negative delay (-17.284786). sched=1279756238.049223,
cur=1279756255.334009
sleepTight: invalid args: -17 -284785985
sleepTight: negative delay (-12.302463). sched=1279756243.049223,
cur=1279756255.351686
sleepTight: invalid args: -12 -302463054
sleepTight: negative delay (-7.319972). sched=1279756248.049223,
cur=1279756255.369195
sleepTight: invalid args: -7 -319972038
sleepTight: negative delay (-2.337748). sched=1279756253.049223,
cur=1279756255.386971
sleepTight: invalid args: -2 -337748050
Thu Jul 22 09:55:48 2010: Viewer conversion completed, cleaning up
Thu Jul 22 09:56:48 2010: Key Lock wait time 1.48 6.9 1.17 3.75 0.0032164
sleepTight: negative delay (-2.482367). sched=1279756613.049223,
cur=1279756615.531590
sleepTight: invalid args: -2 -482367038
Thu Jul 22 10:32:48 2010: Key Lock wait time 4.88 2.28 0.97 8.9 2.72
sleepTight: negative delay (-2.600906). sched=1279758773.049223,
cur=1279758775.650129
sleepTight: invalid args: -2 -600906133
Thu Jul 22 10:51:18 2010: High load: 4.25 seconds/seconds CPU time
sleepTight: negative delay (-21.399701). sched=1279759883.049223,
cur=1279759904.448924
sleepTight: invalid args: -21 -399701118
sleepTight: negative delay (-16.418470). sched=1279759888.049223,
cur=1279759904.467693
sleepTight: invalid args: -16 -418470144
sleepTight: negative delay (-11.437375). sched=1279759893.049223,
cur=1279759904.486598
sleepTight: invalid args: -11 -437375068
sleepTight: negative delay (-6.456028). sched=1279759898.049223,
cur=1279759904.505251
sleepTight: invalid args: -6 -456027984
sleepTight: negative delay (-1.475107). sched=1279759903.049223,
cur=1279759904.524330
sleepTight: invalid args: -1 -475106954
Its a shame theres no timestamp with those sleepTight lines. This is on 3.3.3.
This host is using ntp, theres no way this is some wierd large intermittent
multi-second changes in the system clock or anything like that.
cheers.
--
Nathan
|