When pmlogger_check.sh was relocated recently in the source tree, all
the revision history was lost. Is it possible to revert 499b393 and
redo it in a way that keeps the revision history with the file, or is
this a git "feature"?
Anyway, the real issue here is commit dc62541 that added pmlogconf to
pmlogger_check.sh (I have not checked but suspect the same may apply to
the related changes made to the pmie control scripts). Deep inside
pmlogger_check I found this:

    if $PMLOGCONF -q -h $hostname $tmp/pmlogger

Now, pmlogconf is designed to be interactive, so what really happens here
depends on where stdin is coming from. As this is usually (but not always)
run from cron, stdin is likely to be /dev/null and we get a sort of
default configuration file generated.
Now, what if the pmlogger configuration file was already crafted by hand
using pmlogconf and carefully selecting groups of metrics to be logged?
Along comes pmlogger_check and *whack* your pmlogger config file is
changed from what you really wanted to something "defaulty". This
happens silently. So the sysadmin only finds out when they go to look
at an archive to solve a problem ... *honk* no cigar.
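
One possible guard against the silent overwrite described above, as a sketch only. The function name generate_config_if_missing is hypothetical and not part of pmlogger_check. The treatment of any existing non-empty file as hand-crafted is likewise an assumption. The pmlogconf options are the ones from the line quoted above.

```shell
# Hypothetical helper, not taken from pmlogger_check.
# Usage: generate_config_if_missing <config-file> <hostname>
generate_config_if_missing() {
    config="$1"
    host="$2"
    if [ -s "$config" ]; then
        # Assumption: an existing, non-empty config was crafted by hand,
        # so leave it untouched and say so instead of failing silently.
        echo "keeping existing config: $config" >&2
        return 0
    fi
    # No config yet: generate a default one. Redirecting stdin makes the
    # non-interactive intent explicit rather than depending on whatever
    # stdin cron (or a terminal) happens to provide.
    ${PMLOGCONF:-pmlogconf} -q -h "$host" "$config" </dev/null
}
```

With this shape, an existing config is never rewritten, and the message on stderr would at least end up in cron's mail rather than vanishing.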
This is not a hypothetical Dr No post, it just happened to me on the
logging farm for 32 production machines and the road to recovery is not
pretty. Fortunately (!) we had a system crash soon after so someone was
looking at the logs, otherwise it could have been weeks before the
snarfoo was noticed.
We need to be a lot smarter about how "automated" stuff is done ... I
don't know how to resolve this particular case but the status quo is not
even close to acceptable.