
Re: Prepare to be assimilated^Wanalysed; resistance is futile

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: Prepare to be assimilated^Wanalysed; resistance is futile
From: "Frank Ch. Eigler" <fche@xxxxxxxxxx>
Date: Wed, 17 Jul 2013 09:15:37 -0400
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <444804824.2373005.1374035342123.JavaMail.root@xxxxxxxxxx>
References: <1715044262.9523595.1372389213645.JavaMail.root@xxxxxxxxxx> <y0m4ncfiq4h.fsf@xxxxxxxx> <51D08DEE.6030209@xxxxxxxxxxxxxxxx> <406338386.10303545.1372630273147.JavaMail.root@xxxxxxxxxx> <1251717658.10534278.1372672990990.JavaMail.root@xxxxxxxxxx> <20130702160444.GD19454@xxxxxxxxxx> <399367999.12169937.1372810670160.JavaMail.root@xxxxxxxxxx> <y0moba71pao.fsf@xxxxxxxx> <444804824.2373005.1374035342123.JavaMail.root@xxxxxxxxxx>
User-agent: Mutt/1.4.2.2i
Hi -

> [...] I'm not happy with throwing in the towel on generating good
> configuration files by default though.

OK, as long as we observe the requirement that we do not accidentally
regenerate / modify any files that a sysadmin has created (whether
that was by hand or by a prior interactive pm*conf run).
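
A rough sketch of that requirement in Python, for illustration only.
It assumes a hypothetical marker line (GENERATED_MARKER, not an
existing PCP convention) that only the non-interactive default
generator writes, so hand-made files and files from an interactive
pm*conf run never carry it and are never touched:

```python
import os
import tempfile

# Hypothetical marker, written only by the non-interactive generator.
GENERATED_MARKER = "# auto-generated default; safe to regenerate"


def may_regenerate(path):
    """True only if path is missing or its first line is our marker."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            first = f.readline().rstrip("\n")
    except FileNotFoundError:
        return True
    return first == GENERATED_MARKER


def write_generated(path, body):
    """Write body under the marker; refuse to touch any other file."""
    if not may_regenerate(path):
        return False
    directory = os.path.dirname(path) or "."
    # Write to a temp file and rename, so a crash cannot leave a
    # half-written config behind.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(GENERATED_MARKER + "\n" + body)
    os.replace(tmp, path)
    return True
```

A marker cannot tell whether someone edited a generated file but left
the first line alone; storing a checksum of the generated body would
be a stricter variant of the same idea.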


> [...]
> > comes up, we'd like to start logging it within (say) seconds, rather
> > than up to 30 minutes.  (This could be worked around by hand-invoking
> > the _check* routine upon the arrival of new hosts, though then we have
> > a lot more cpu consumption, and a lot more busy-work checking on other
> > pmloggers.)
> 
> Not convinced it's going to cost a whole lot - new hosts do not arrive
> that often - this is a once-in-a-while thing, so your poke-it-directly
> solution above would indeed work in practice.

OK, let's confirm that it's low-cost, and that it actually works if
e.g. the same host comes and goes several times during a day (so new
archives need to be created for each pmlogger launch).


> > Second, there is nothing that handles the disappearance of remote
> > nodes, or equivalently, a sysadmin commenting out lines in
> > pm{logger|ie}/config.default.  The _check* scripts may notice them but
> > don't consider it their problem to kill them.
> 
> *nod* - this problem I have seen in real production environments, and it
> is sorta-handled in a non-intuitive way - as soon as the remote host goes
> away, pmlogger loses the connection and it exits [...]

This sounds like an unfortunate policy, if for example there are
temporary network glitches or a quick reboot.  A 30-minute re-poll is
IMO too slow.
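
One alternative to exit-and-wait is a short reconnect loop with
capped backoff. The Python sketch below is an assumption of mine, not
how pmlogger behaves today; try_connect is a hypothetical callback,
and the 5-second start, 300-second cap and 1800-second budget are
illustrative numbers (the budget matching the 30-minute re-poll, so
the periodic check takes over after that):

```python
import time


def backoff_delays(first=5.0, factor=2.0, cap=300.0):
    """Yield retry delays in seconds, growing by factor up to cap."""
    delay = first
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def reconnect(try_connect, give_up_after=1800.0, sleep=time.sleep):
    """Call try_connect() until it succeeds or the time budget is spent.

    Returns True on success, False once the next wait would exceed
    give_up_after seconds of total sleeping.
    """
    waited = 0.0
    for delay in backoff_delays():
        if try_connect():
            return True
        if waited + delay > give_up_after:
            return False
        sleep(delay)
        waited += delay
```

With these numbers a quick reboot is picked up within seconds to a
few minutes, while a host that is really gone costs only a handful of
connection attempts.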


> Another corner case to worry about is a pmlogger entry that was in the
> control file, but later removed (via sysadmin) - this process is no longer
> tracked and will not be stopped/log-rotated.  In the Aconex environment,
> this potential issue was combated via a dead-hand timer approach using
> the -T option to pmlogger.

Given that pmlogger_daily is also a default-on cron job now (right?),
perhaps this -T flag should be a default for those instances invoked by
pmlogger_check.
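
As a sketch of what "default" could mean here, in Python rather than
the shell the _check scripts are written in: add -T to a logger's
argument list only when the control-file entry does not already set
one.  The helper name and the 24h10m value are assumptions for
illustration; nothing in this thread fixes the actual interval.

```python
# Assumed value for illustration: a little over one day, so that a
# daily rotation would normally restart the logger before it expires.
DEFAULT_END_TIME = "24h10m"


def with_default_end_time(args, end_time=DEFAULT_END_TIME):
    """Return a copy of args, appending -T unless one is already given.

    Recognises both the separate form ("-T", "12h") and the joined
    form ("-T12h").  It does not try to parse option values, so a
    value that itself starts with -T would be mistaken for the flag.
    """
    if any(a.startswith("-T") for a in args):
        return list(args)
    return list(args) + ["-T", end_time]
```

An entry that already carries its own -T is left alone, so a
sysadmin's explicit choice still wins over the default.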

Another issue is cleanup of the archives left over by prior logger
targets.


> Moving along with all this, my current thinking is to continue on
> with testing the code in the dev branch, and use that as the basis
> of the next release.  [...]

OK.  I'll belay my parallel efforts and return to the config.d idea.


- FChE
