pcp
[Top] [All Lists]

Re: [pcp] pmie and system clock changes

To: kenj@xxxxxxxxxxxxxxxx
Subject: Re: [pcp] pmie and system clock changes
From: nathans@xxxxxxxxxx
Date: Thu, 22 Jul 2010 12:05:40 +1000 (EST)
Cc: pcp@xxxxxxxxxxx, Martin Hicks <mort@xxxxxxx>
In-reply-to: <1759478894.1131561279763520887.JavaMail.root@xxxxxxxxxxxxxxxxxx>
Sender: nscott@xxxxxxxxxx
Hi guys,

----- "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx> wrote:

> I suspect we can solve this problem with a large hammer that will
> work on all platforms (Greg's suggestion won't apparently work on
> Mac OS X).
> 
> If the time ever goes backwards (not a small ntp adjustment, I mean
> now - last_delay is < now at the last sample, so time went back more
> than 2*last_delay), there is no point keeping any history and any
> notion of a "rate" over the last interval is pretty wobbly.
> ..

We've now observed a few occurrences of the log entries below in our
production environment... this is a localhost pmie (i.e. running on
the host its monitoring, no pmcd restart or anything here ... it just
spontaneously fails, occassionally, afaict):

Log for pmie on [HOST] rotated Thu Jul 22 00:14:01 2010

pmie: PID = 25944, default host = [HOST]

Thu Jul 22 08:14:13 2010: High load: 5.8 seconds/seconds CPU time
sleepTight: negative delay (-14.907716). sched=1279750458.049223, 
cur=1279750472.956939
sleepTight: invalid args: -14 -907716035
sleepTight: negative delay (-9.929606). sched=1279750463.049223, 
cur=1279750472.978829
sleepTight: invalid args: -9 -929605960
sleepTight: negative delay (-4.950649). sched=1279750468.049223, 
cur=1279750472.999872
sleepTight: invalid args: -4 -950649023
Thu Jul 22 09:50:33 2010: High load: 4.35 seconds/seconds CPU time
sleepTight: negative delay (-17.284786). sched=1279756238.049223, 
cur=1279756255.334009
sleepTight: invalid args: -17 -284785985
sleepTight: negative delay (-12.302463). sched=1279756243.049223, 
cur=1279756255.351686
sleepTight: invalid args: -12 -302463054
sleepTight: negative delay (-7.319972). sched=1279756248.049223, 
cur=1279756255.369195
sleepTight: invalid args: -7 -319972038
sleepTight: negative delay (-2.337748). sched=1279756253.049223, 
cur=1279756255.386971
sleepTight: invalid args: -2 -337748050
Thu Jul 22 09:55:48 2010: Viewer conversion completed, cleaning up
Thu Jul 22 09:56:48 2010: Key Lock wait time 1.48 6.9 1.17 3.75 0.0032164
sleepTight: negative delay (-2.482367). sched=1279756613.049223, 
cur=1279756615.531590
sleepTight: invalid args: -2 -482367038
Thu Jul 22 10:32:48 2010: Key Lock wait time 4.88 2.28 0.97 8.9 2.72
sleepTight: negative delay (-2.600906). sched=1279758773.049223, 
cur=1279758775.650129
sleepTight: invalid args: -2 -600906133
Thu Jul 22 10:51:18 2010: High load: 4.25 seconds/seconds CPU time
sleepTight: negative delay (-21.399701). sched=1279759883.049223, 
cur=1279759904.448924
sleepTight: invalid args: -21 -399701118
sleepTight: negative delay (-16.418470). sched=1279759888.049223, 
cur=1279759904.467693
sleepTight: invalid args: -16 -418470144
sleepTight: negative delay (-11.437375). sched=1279759893.049223, 
cur=1279759904.486598
sleepTight: invalid args: -11 -437375068
sleepTight: negative delay (-6.456028). sched=1279759898.049223, 
cur=1279759904.505251
sleepTight: invalid args: -6 -456027984
sleepTight: negative delay (-1.475107). sched=1279759903.049223, 
cur=1279759904.524330
sleepTight: invalid args: -1 -475106954

Its a shame theres no timestamp with those sleepTight lines.  This is on 3.3.3.
This host is using ntp, theres no way this is some wierd large intermittent
multi-second changes in the system clock or anything like that.

cheers.

-- 
Nathan

<Prev in Thread] Current Thread [Next in Thread>