[pcp] zbxpcp: tolerate pmcd restarts
Marko Myllynen
myllynen at redhat.com
Wed Jan 27 04:16:56 CST 2016
Hi,
As noted earlier [1], zabbix-agent cannot be started before pmcd, but
zbxpcp can recover from pmcd restarts by calling pmReconnectContext(3),
so let's do that.
[1] http://oss.sgi.com/pipermail/pcp/2016-January/009408.html
Update zbxpcp(3) accordingly and make it follow the PCP man page style.
---
src/zabbix-agent/src/zbxpcp.3 | 29 ++++++++++++++++-------------
src/zabbix-agent/src/zbxpcp.c | 16 +++++++++++-----
2 files changed, 27 insertions(+), 18 deletions(-)
diff --git a/src/zabbix-agent/src/zbxpcp.3 b/src/zabbix-agent/src/zbxpcp.3
index dcc41ee..8445d65 100644
--- a/src/zabbix-agent/src/zbxpcp.3
+++ b/src/zabbix-agent/src/zbxpcp.3
@@ -27,20 +27,23 @@ With the
module configured in
.I zabbix_agentd.conf
all the PCP metrics are available from the Zabbix agent like any other
-agent items. As a loadable module (DSO)
+agent items.
+As a loadable module (DSO)
.B zbxpcp
does not rely on any external programs but directly uses the PCP APIs to
fetch PCP metrics when requested.
.PP
A typical PCP installation on Linux offers over 1000 metrics by default
and is in turn extensible with its own plugins, or PMDAs (``Performance
-Metrics Domain Agents''). In addition to very complete
+Metrics Domain Agents'').
+In addition to very complete
.I /proc
based statistics, readily available PCP PMDAs provide support for such
system and application level components as Apache, CIFS, 389 Directory
Server, GFS2, Gluster, InfiniBand, KVM, MySQL, NFS, Postfix, PostgreSQL,
-Samba, and Sendmail, among others. In addition to Linux, PCP also runs
-on Mac OS X, FreeBSD, NetBSD, Solaris, and Windows.
+Samba, and Sendmail, among others.
+In addition to Linux, PCP also runs on Mac OS X, FreeBSD, NetBSD,
+Solaris, and Windows.
.PP
For PCP introduction, see
.BR PCPIntro (1).
@@ -52,8 +55,8 @@ is available at the PCP home page http://pcp.io/.
.PP
For general information about Zabbix data collection and loadable
modules, see
-https://www.zabbix.com/documentation/3.0/manual/config/items. For Zabbix
-introduction and downloads, see http://www.zabbix.com/.
+https://www.zabbix.com/documentation/3.0/manual/config/items.
+For Zabbix introduction and downloads, see http://www.zabbix.com/.
.PP
.B zbxpcp
is compatible with the Zabbix module API version
@@ -62,7 +65,8 @@ is compatible with the Zabbix module API version
First make sure PCP is installed and configured properly, see the above
references for instructions and use for example
.BR pminfo (1)
-to make sure the PCP metrics can be fetched. To enable the
+to make sure the PCP metrics can be fetched.
+To enable the
.B zbxpcp
loadable module in a Zabbix agent, the following lines must be added to
the Zabbix agent configuration file
@@ -82,8 +86,8 @@ LoadModule=zbxpcp.so
After restarting the Zabbix agent all the PCP metrics will be available
with the ``\c
.BR pcp. ''
-prefix like all the other agent items. This can be verified with the
-commands:
+prefix like all the other agent items.
+This can be verified with the commands:
.RS +4
.ft CW
@@ -104,10 +108,9 @@ The PCP
service must always be running when starting up a
.B zbxpcp
enabled Zabbix agent, otherwise the module will fail to load and the PCP
-metrics will not become available. Special care must be taken to make
-sure this happens also when rebooting the system. A
-.B pmcd
-restart needs to be followed by a Zabbix agent restart.
+metrics will not become available.
+Special care must be taken to make sure this happens also when rebooting
+the system.
.SH FILES
.PD 0
.TP 10
diff --git a/src/zabbix-agent/src/zbxpcp.c b/src/zabbix-agent/src/zbxpcp.c
index 40e28c2..4651726 100644
--- a/src/zabbix-agent/src/zbxpcp.c
+++ b/src/zabbix-agent/src/zbxpcp.c
@@ -19,6 +19,7 @@
/*
* TODO
+ * - reconnection interval support
* - derived metrics support
* - config file support
* - conn type
@@ -48,13 +49,13 @@
*/
static int ctx = -1;
-int zbx_module_pcp_init()
+int zbx_module_pcp_connect()
{
ctx = pmNewContext(PM_CONTEXT_HOST, "localhost");
return ctx;
}
-int zbx_module_pcp_uninit()
+int zbx_module_pcp_disconnect()
{
return pmDestroyContext(ctx);
}
@@ -64,7 +65,7 @@ int zbx_module_pcp_uninit()
*/
int zbx_module_init()
{
- if (zbx_module_pcp_init() < 0)
+ if (zbx_module_pcp_connect() < 0)
return ZBX_MODULE_FAIL;
return ZBX_MODULE_OK;
}
@@ -104,7 +105,7 @@ void zbx_module_item_timeout(int timeout)
int zbx_module_uninit()
{
- if (zbx_module_pcp_uninit() != 0)
+ if (zbx_module_pcp_disconnect() != 0)
return ZBX_MODULE_FAIL;
return ZBX_MODULE_OK;
}
@@ -186,8 +187,13 @@ int zbx_module_pcp_fetch_metric(AGENT_REQUEST *request, AGENT_RESULT *result)
break;
}
- /* Preparations and sanity checks. */
+ /* Try to reconnect if the initial lookup fails. */
sts = pmLookupName(1, metric, pmid);
+ if (sts < 0)
+ pmReconnectContext(ctx);
+
+ /* Preparations and sanity checks. */
+ if (sts < 0) sts = pmLookupName(1, metric, pmid);
if (sts < 0) return SYSINFO_RET_FAIL;
sts = pmLookupDesc(pmid[0], desc);
if (sts < 0) return SYSINFO_RET_FAIL;
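
For readers who want to see the retry logic of the last hunk in isolation, here is a minimal sketch. It is not part of the patch. fake_lookup() and fake_reconnect() are hypothetical stand-ins for pmLookupName(3) and pmReconnectContext(3), so that the example builds without PCP installed. lookup_with_reconnect() is likewise a made-up name, and the metric name in the usage note is only an example.

```c
/*
 * Hypothetical stand-ins for the libpcp calls used in the patch.
 * They model a pmcd that has been restarted: lookups fail until
 * the context has been reconnected.
 */
static int connected = 0;
static int reconnect_calls = 0;

static int fake_reconnect(int ctx)
{
    (void)ctx;                  /* the real call takes the context handle */
    reconnect_calls++;
    connected = 1;
    return 0;
}

static int fake_lookup(const char *name)
{
    (void)name;
    return connected ? 0 : -1;  /* negative status means failure */
}

/*
 * Same shape as the patched fetch path: try the lookup, and if it
 * fails, reconnect once and retry the lookup once.
 */
static int lookup_with_reconnect(int ctx, const char *name)
{
    int sts = fake_lookup(name);

    if (sts < 0) {
        fake_reconnect(ctx);
        sts = fake_lookup(name);
    }
    return sts;
}
```

Calling lookup_with_reconnect(0, "kernel.all.load") while the stand-in connection is down triggers exactly one reconnect and then succeeds. Later calls do not reconnect again. As in the patch, any lookup failure triggers a reconnect attempt, not only a failure caused by a pmcd restart.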
Thanks,
--
Marko Myllynen