
zbxpcp: tolerate pmcd restarts

To: pcp developers <pcp@xxxxxxxxxxx>
Subject: zbxpcp: tolerate pmcd restarts
From: Marko Myllynen <myllynen@xxxxxxxxxx>
Date: Wed, 27 Jan 2016 12:16:56 +0200
Delivered-to: pcp@xxxxxxxxxxx
Organization: Red Hat
Reply-to: Marko Myllynen <myllynen@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
Hi,

As noted earlier [1], zabbix-agent cannot be started before pmcd, but
zbxpcp can recover from pmcd restarts with pmReconnectContext(3), so
let's do that.

[1] http://oss.sgi.com/pipermail/pcp/2016-January/009408.html

Update zbxpcp(3) accordingly and make it follow the PCP man page style.
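
For readers who want the recovery idea in isolation, here is a minimal
sketch of the "retry once after reconnecting" pattern described above. It
is not the module's code: the function-pointer types, the fake_conn
struct, and the fake_* helpers are hypothetical stand-ins so the example
compiles without PCP installed. In zbxpcp the lookup would be
pmLookupName(3) and the reconnect would be pmReconnectContext(3). Unlike
the patch below, this sketch skips the retry when the reconnect itself
fails, which is an assumption about what a caller might prefer.

```c
/* Hypothetical callback types standing in for the PCP calls used by the
 * patch (pmLookupName and pmReconnectContext); negative means failure. */
typedef int (*lookup_fn)(void *state, const char *metric);
typedef int (*reconnect_fn)(void *state);

/* Try the lookup once; if it fails, reconnect and retry exactly once.
 * Returns the status of the last lookup attempt. */
int lookup_with_reconnect(lookup_fn lookup, reconnect_fn reconnect,
                          void *state, const char *metric)
{
    int sts = lookup(state, metric);

    if (sts < 0) {
        /* Assumption: no point retrying if the reconnect failed. */
        if (reconnect(state) < 0)
            return sts;
        sts = lookup(state, metric);
    }
    return sts;
}

/* Simulated daemon connection, for illustration only. */
struct fake_conn {
    int connected;   /* non-zero while the simulated daemon is reachable */
    int reconnects;  /* number of reconnect attempts made so far */
};

/* Simulated lookup: succeeds only while the connection is up. */
int fake_lookup(void *state, const char *metric)
{
    struct fake_conn *c = state;

    (void)metric;
    return c->connected ? 1 : -1;
}

/* Simulated reconnect: always succeeds and counts the attempt. */
int fake_reconnect(void *state)
{
    struct fake_conn *c = state;

    c->connected = 1;
    c->reconnects++;
    return 0;
}
```

Calling lookup_with_reconnect() with a fake_conn whose connected field is
0 triggers one reconnect and then succeeds; later calls succeed without
reconnecting again.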

---
 src/zabbix-agent/src/zbxpcp.3 | 29 ++++++++++++++++-------------
 src/zabbix-agent/src/zbxpcp.c | 16 +++++++++++-----
 2 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/src/zabbix-agent/src/zbxpcp.3 b/src/zabbix-agent/src/zbxpcp.3
index dcc41ee..8445d65 100644
--- a/src/zabbix-agent/src/zbxpcp.3
+++ b/src/zabbix-agent/src/zbxpcp.3
@@ -27,20 +27,23 @@ With the
 module configured in
 .I zabbix_agentd.conf
 all the PCP metrics are available from the Zabbix agent like any other
-agent items. As a loadable module (DSO)
+agent items.
+As a loadable module (DSO)
 .B zbxpcp
 does not rely on any external programs but directly uses the PCP APIs to
 fetch PCP metrics when requested.
 .PP
 A typical PCP installation on Linux offers over 1000 metrics by default
 and is in turn extensible with its own plugins, or PMDAs (``Performance
-Metrics Domain Agents''). In addition to very complete
+Metrics Domain Agents'').
+In addition to very complete
 .I /proc
 based statistics, readily available PCP PMDAs provide support for such
 system and application level components as Apache, CIFS, 389 Directory
 Server, GFS2, Gluster, InfiniBand, KVM, MySQL, NFS, Postfix, PostgreSQL,
-Samba, and Sendmail, among others. In addition to Linux, PCP also runs
-on Mac OS X, FreeBSD, NetBSD, Solaris, and Windows.
+Samba, and Sendmail, among others.
+In addition to Linux, PCP also runs on Mac OS X, FreeBSD, NetBSD,
+Solaris, and Windows.
 .PP
 For PCP introduction, see
 .BR PCPIntro (1).
@@ -52,8 +55,8 @@ is available at the PCP home page http://pcp.io/.
 .PP
 For general information about Zabbix data collection and loadable
 modules, see
-https://www.zabbix.com/documentation/3.0/manual/config/items. For Zabbix
-introduction and downloads, see http://www.zabbix.com/.
+https://www.zabbix.com/documentation/3.0/manual/config/items.
+For Zabbix introduction and downloads, see http://www.zabbix.com/.
 .PP
 .B zbxpcp
 is compatible with the Zabbix module API version
@@ -62,7 +65,8 @@ is compatible with the Zabbix module API version
 First make sure PCP is installed and configured properly, see the above
 references for instructions and use for example
 .BR pminfo (1)
-to make sure the PCP metrics can be fetched. To enable the
+to make sure the PCP metrics can be fetched.
+To enable the
 .B zbxpcp
 loadable module in a Zabbix agent, the following lines must be added to
 the Zabbix agent configuration file
@@ -82,8 +86,8 @@ LoadModule=zbxpcp.so
 After restarting the Zabbix agent all the PCP metrics will be available
 with the ``\c
 .BR pcp. ''
-prefix like all the other agent items. This can be verified with the
-commands:
+prefix like all the other agent items.
+This can be verified with the commands:
 
 .RS +4
 .ft CW
@@ -104,10 +108,9 @@ The PCP
 service must always be running when starting up a
 .B zbxpcp
 enabled Zabbix agent, otherwise the module will fail to load and the PCP
-metrics will not become available. Special care must be taken to make
-sure this happens also when rebooting the system. A
-.B pmcd
-restart needs to be followed by a Zabbix agent restart.
+metrics will not become available.
+Special care must be taken to make sure this happens also when rebooting
+the system.
 .SH FILES
 .PD 0
 .TP 10
diff --git a/src/zabbix-agent/src/zbxpcp.c b/src/zabbix-agent/src/zbxpcp.c
index 40e28c2..4651726 100644
--- a/src/zabbix-agent/src/zbxpcp.c
+++ b/src/zabbix-agent/src/zbxpcp.c
@@ -19,6 +19,7 @@
 
 /*
  * TODO
+ * - reconnection interval support
  * - derived metrics support
  * - config file support
  *   - conn type
@@ -48,13 +49,13 @@
  */
 static int ctx = -1;
 
-int zbx_module_pcp_init()
+int zbx_module_pcp_connect()
 {
     ctx = pmNewContext(PM_CONTEXT_HOST, "localhost");
     return ctx;
 }
 
-int zbx_module_pcp_uninit()
+int zbx_module_pcp_disconnect()
 {
     return pmDestroyContext(ctx);
 }
@@ -64,7 +65,7 @@ int zbx_module_pcp_uninit()
  */
 int zbx_module_init()
 {
-    if (zbx_module_pcp_init() < 0)
+    if (zbx_module_pcp_connect() < 0)
         return ZBX_MODULE_FAIL;
     return ZBX_MODULE_OK;
 }
@@ -104,7 +105,7 @@ void zbx_module_item_timeout(int timeout)
 
 int zbx_module_uninit()
 {
-    if (zbx_module_pcp_uninit() != 0)
+    if (zbx_module_pcp_disconnect() != 0)
         return ZBX_MODULE_FAIL;
     return ZBX_MODULE_OK;
 }
@@ -186,8 +187,13 @@ int zbx_module_pcp_fetch_metric(AGENT_REQUEST *request, AGENT_RESULT *result)
             break;
     }
 
-    /* Preparations and sanity checks.  */
+    /* Try to reconnect if the initial lookup fails.  */
     sts = pmLookupName(1, metric, pmid);
+    if (sts < 0)
+        pmReconnectContext(ctx);
+
+    /* Preparations and sanity checks.  */
+    if (sts < 0) sts = pmLookupName(1, metric, pmid);
     if (sts < 0) return SYSINFO_RET_FAIL;
     sts = pmLookupDesc(pmid[0], desc);
     if (sts < 0) return SYSINFO_RET_FAIL;

Thanks,

-- 
Marko Myllynen
