https://bugzilla.redhat.com/show_bug.cgi?id=1065803
--- Comment #12 from Nathan Scott <nathans@xxxxxxxxxx> ---
To clarify, the fix here is to automatically restart agents that have become
unresponsive. This is typically caused by an unexpected, very large latency
during PMDA sampling; the source of that latency lies outside of PCP, so it
cannot be fixed here.
This is achieved through a combination of pmdaroot starting PMDAs (i.e. set
PMCD_ROOT_AGENT=1 in /etc/sysconfig/pmcd - which is now the default) and:
# chkconfig pmie on
# service pmie start
This enables the pmie rule which checks for agents that have exited, and
automates their restart (within ~5 seconds - with a holdoff of 1 minute after
any such attempt). A message is also logged to syslog at the time a restart is
attempted.
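The holdoff behaviour described above can be sketched in shell as follows.
This is only an illustration of the timing logic, not the actual pmie rule;
the function name may_restart and the variable HOLDOFF are hypothetical and
do not come from PCP.

```shell
# Hypothetical sketch of the holdoff decision; NOT the actual pmie rule.
HOLDOFF=60   # seconds, matching the 1 minute holdoff described above

# may_restart NOW LAST_ATTEMPT
# Succeeds (exit status 0) if at least HOLDOFF seconds have passed
# between LAST_ATTEMPT and NOW, both given as seconds since the epoch.
may_restart() {
    now=$1
    last=$2
    [ $((now - last)) -ge "$HOLDOFF" ]
}
```

A caller might record the time of each restart attempt and consult
may_restart "$(date +%s)" "$last_attempt" before trying again, so that a
persistently failing agent is not restarted in a tight loop.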
These two components of the fix first came together in pcp-3.11.1; however, the
pmie rule could be used in pcp-3.11.0 as well if anyone needs that.
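Before relying on the restart rule, it may be worth confirming that the
PMCD_ROOT_AGENT setting mentioned above is actually in place. A minimal
sketch follows, assuming the file uses plain shell-style NAME=value lines
with an unquoted value; the helper name root_agent_enabled is hypothetical
and is not part of PCP.

```shell
# Hypothetical helper, not part of PCP.
# Succeeds if the given file (default /etc/sysconfig/pmcd) contains a
# line setting PMCD_ROOT_AGENT=1, optionally prefixed with "export".
# Assumes an unquoted value; a quoted "1" would not be matched.
root_agent_enabled() {
    conf="${1:-/etc/sysconfig/pmcd}"
    grep -Eq '^[[:space:]]*(export[[:space:]]+)?PMCD_ROOT_AGENT=1[[:space:]]*$' "$conf"
}
```

For example, root_agent_enabled || echo "PMCD_ROOT_AGENT is not set to 1"
would print a warning when the setting is missing or commented out.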