Re: Timezones and archive rotation

To: kenj@xxxxxxxxxxxxxxxx
Subject: Re: Timezones and archive rotation
From: Nathan Scott <nscott@xxxxxxxxxx>
Date: Mon, 12 Oct 2009 12:09:42 +1100 (EST)
Cc: pcp@xxxxxxxxxxx
In-reply-to: <1480161420.401255309160945.JavaMail.root@xxxxxxxxxxxxxxxxxx>
----- "Ken McDonell" <kenj@xxxxxxxxxxxxxxxx> wrote:

> On Fri, 2009-10-09 at 10:40 +1100, Nathan Scott wrote:
> > Hi Ken,
> > 
> > Probably you have a better handle on this stuff than me, so
> > here's a couple of observations/questions on archive rotation.
> > 
> > - should pmlogextract prefer the final archive in its choice
> > of timezone, rather than the first?  [ since daylight savings
> > changes usually/always/sometimes (dunno what other geographies
> > are like, but I assume they're like .au?) happen early in the
> > morning, and the "standard" (as per man pages) time for doing
> > archive rotation and merging is just after midnight. ]
> 
> This is 100% no-win territory.  It does not matter when you run the
> daily script, the run that spans a timezone change is sort of doomed
> ... if you use the timezone from the first archive the times are off
> _after_ the timezone change, if you use the timezone from the last
> archive the times are off _before_ the timezone change.
> 
> I think the status quo is probably the best one could hope for and
> the warnings are legitimate.

I agree the warnings are legitimate.  Probably.  It's unfortunate that
we send so much mail from our start scripts ... this is a regular,
scheduled, well understood occurrence (daylight savings changes), so
it is questionable whether we should send mail out about this - it may
cover many hundreds of hosts.  But, *shrug* ... I can live with that.

As to whether we have the ideal status quo, that's more questionable.
I agree there's no one right answer here, but the defaults seem to do
the wrong thing.  Daylight savings always switches over in the early
hours of the morning.  Our daily scripts always run before that, so
the first archive of the switch day carries the old timezone, and
reporting with -z then has the maximum number of hours reported in
the wrong zone for that day.  If we used the last archive, we would
avoid that issue, and I can't see any real downside from doing so.
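
To put rough numbers on that, here is a minimal sketch in Python.  It is
not how pmlogextract works internally; the function name and the times
(logs rotated at 00:10, switch at 02:00, a 24 hour span on one uniform
clock) are assumptions chosen purely for illustration.

```python
from datetime import datetime, timedelta

def hours_in_wrong_zone(start, end, switch, use_last):
    """Hours of a merged archive that get labelled with the wrong zone.

    start/end/switch are timestamps on one uniform clock (e.g. UTC).
    use_last=False: the merged archive takes the zone of the first
    input archive (the zone before the switch), so everything after
    the switch is off.  use_last=True: it takes the zone of the last
    input archive, so everything before the switch is off.
    """
    switch = min(max(switch, start), end)  # clamp the switch into the span
    wrong = (switch - start) if use_last else (end - switch)
    return wrong.total_seconds() / 3600.0

# Hypothetical switch day: values are illustrative, not taken from any log.
start = datetime(2009, 10, 4, 0, 10)     # first archive of the day begins
end = start + timedelta(hours=24)        # next rotation, 24 elapsed hours on
switch = datetime(2009, 10, 4, 2, 0)     # daylight savings change

first_choice = hours_in_wrong_zone(start, end, switch, use_last=False)
last_choice = hours_in_wrong_zone(start, end, switch, use_last=True)
print("first archive's zone: %.2f hours off" % first_choice)
print("last archive's zone:  %.2f hours off" % last_choice)
```

With those assumed times the first archive's zone mislabels roughly 22
hours of the day and the last archive's zone roughly 2, which is the
asymmetry I'm describing.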

> > - any thoughts on .0 (data) files being mysteriously removed?
> > pretty sure there's nothing touching our production logs
> > outside of pmlogger_check/daily ... something in there seems
> > to be nuking (/not creating?) .0 files (though .meta & .index
> > files exist - so I suspect it's somehow being nuked).  I've seen
> > it a few times, out of the blue, randomly - haven't been able
> > to see any pattern to it though.
> 
> In the output below is this all contiguous from one run?  If so,

Yes, this is all one mail message from one pmlogger_daily run.
This particular host is monitoring 24 other hosts in that data
centre, so the pmlogger_daily run probably takes a few minutes.

> Oct 9 merging the previous day's archives complain about timezone
> changes ... but Oct 8/9 is not a timezone change day for Oz ... ???

That's because pmcd.timezone was b0rked and didn't ever actually
change (3.0.0 is fixed).  This one was a manual pmcd restart on our
systems to pick up the daylight savings change ... a few days after
the actual change.

> Now the missing archive is from a previously unmerged set of archives
> from Oct 2 ... I have no explanation for this ... I can see how it
> might be possible to have a missing file from an archive that is
> exactly $CULLAFTER days old, but this does not match your observation.

Oh BTW, we run pmlogger_daily with options: "-k 7 -x 2" FWIW.  Unsure
if that's relevant to this issue ... it's intermittent, so difficult to
diagnose.

> I'd suggest using -t 7 to keep the last week's worth of verbose logs
> to help diagnose this.

Will do, forgot about that option - thanks!
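
In the meantime, something like the following could be run against a
log directory to catch the problem closer to when it happens.  It is
only a sketch: the function name is made up, and it assumes archive
files are named <base>.meta, <base>.index and <base>.0, with compressed
data volumes carrying an extra .gz, .bz2 or .xz suffix.

```python
import os
from collections import defaultdict

# Assumed compression suffixes that may be appended to a data volume.
COMPRESSED = (".gz", ".bz2", ".xz")

def archives_missing_data(directory):
    """Return sorted archive base names that have .meta and .index
    files but no .0 data volume (compressed or not)."""
    suffixes = defaultdict(set)
    for name in os.listdir(directory):
        # Treat 20091002.0.bz2 the same as 20091002.0
        for comp in COMPRESSED:
            if name.endswith(comp):
                name = name[:-len(comp)]
                break
        base, dot, ext = name.rpartition(".")
        if dot and base:
            suffixes[base].add(ext)
    return sorted(
        base for base, exts in suffixes.items()
        if {"meta", "index"} <= exts and "0" not in exts
    )
```

Calling archives_missing_data() on a host's log directory returns the
base names worth a closer look; an empty list means every archive with
metadata and an index still has its first data volume.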

cheers.

-- 
Nathan
