From nscott@aconex.com Sun Nov 19 21:39:12 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:28 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020689 for ; Sun, 19 Nov 2006 21:39:11 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 2164428C7A; Mon, 20 Nov 2006 16:09:58 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 0756553403A; Mon, 20 Nov 2006 16:09:58 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28591-01-25; Mon, 20 Nov 2006 16:09:56 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id DEFAC534039; Mon, 20 Nov 2006 16:09:56 +1100 (EST)
Subject: [PATCH 07/12] Fix distro reporting on Debian
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:10:34 +1100
Message-Id: <1163999434.4695.238.camel@edge>
Mime-Version: 1.0
X-Mailer:
	Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 582
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

This patch adds support for the Debian distribution into the Linux
PMDA, alongside the existing RH/SuSE values.

-- 
Nathan

Index: devel-pcp-2.5.99/src/pmdas/linux/pmda.c
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/linux/pmda.c	2006-11-20 11:28:56.653481750 +1100
+++ devel-pcp-2.5.99/src/pmdas/linux/pmda.c	2006-11-20 11:40:33.657041750 +1100
@@ -3896,8 +3896,10 @@ linux_fetchCallBack(pmdaMetric *mdesc, u
             * more
             */
            struct stat sbuf;
-           int r, fd = -1;
+           int r, fd = -1, len = 0;
+           char prefix[16];
            char *rfiles[] = {
+               "/etc/debian_version",
                "/etc/fedora-release",
                "/etc/redhat-release",
                "/etc/SuSE-release",
@@ -3909,14 +3911,19 @@ linux_fetchCallBack(pmdaMetric *mdesc, u
                }
            }
            if (fd != -1) {
+               if (r == 0) {   /* Debian, needs prefix */
+                   strncpy(prefix, "Debian ", sizeof(prefix));
+                   len = 7;
+               }
                /*
                 * at this point, assume sbuf is good and file contains
                 * the string we want, probably with a \n terminator
                 */
-               distro_name = (char *)malloc((int)sbuf.st_size+1);
-
+               distro_name = (char *)malloc(len + (int)sbuf.st_size + 1);
                if (distro_name != NULL) {
-                   r = read(fd, distro_name, (int)sbuf.st_size);
+                   if (len)
+                       strncpy(distro_name, prefix, len);
+                   r = read(fd, distro_name + len, (int)sbuf.st_size);
                    close(fd);
                    if (r <= 0) {
                        free (distro_name);

From nscott@aconex.com Sun Nov 19 21:39:09 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:30 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5d8aG020654 for ; Sun, 19 Nov 2006 21:39:09 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id E92E828C89; Mon, 20 Nov 2006 16:08:58 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id BF6C65340F5; Mon, 20 Nov 2006 16:08:58 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28591-01-18; Mon, 20 Nov 2006 16:08:57 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 4C74E53403A; Mon, 20 Nov 2006 16:08:56 +1100 (EST)
Subject: [PATCH 04/12] Fix pagefault metrics
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:09:34 +1100
Message-Id: <1163999374.4695.235.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 585
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

The page fault metrics in the Linux PMDA are maintained as counters in
the kernel, but we're currently exporting them as discrete values.
This means they're not handled correctly in most client tools.
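
To illustrate why the declared semantics matter, here is a minimal
sketch in C of how a client tool might treat two successive samples of
the same metric. It assumes that tools rate-convert counters and show
discrete values unchanged; the enum and the display_value() helper are
hypothetical names made up for this example, not part of the PCP API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the semantics a metric descriptor carries;
 * these are NOT the real PM_SEM_* constants from the PCP headers. */
enum sem { SEM_COUNTER, SEM_DISCRETE };

/*
 * What a client tool might display given two successive 32-bit samples
 * taken dt seconds apart.
 */
static double
display_value(enum sem sem, uint32_t prev, uint32_t curr, double dt)
{
    assert(dt > 0.0);
    if (sem == SEM_COUNTER) {
        /* modulo-2^32 subtraction still gives the right delta after one wrap */
        uint32_t delta = (uint32_t)(curr - prev);
        return (double)delta / dt;
    }
    /* discrete: no rate conversion, the latest sample is shown as-is */
    return (double)curr;
}
```

With samples of 1000 and then 1500 faults taken 5 seconds apart, the
counter interpretation yields 100 faults per second, whereas the
discrete interpretation simply reports 1500, which is an ever-growing
total rather than an activity level.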

-- 
Nathan

Index: devel-pcp-2.5.99/src/pmdas/linux/pmda.c
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/linux/pmda.c	2006-11-20 11:27:21.083509000 +1100
+++ devel-pcp-2.5.99/src/pmdas/linux/pmda.c	2006-11-20 11:27:38.748613000 +1100
@@ -1142,22 +1142,22 @@ static pmdaMetric metrictab[] = {
 
 /* proc.psinfo.minflt */
   { NULL,
-    { PMDA_PMID(CLUSTER_PID_STAT,9), PM_TYPE_U32, PROC_INDOM, PM_SEM_DISCRETE,
+    { PMDA_PMID(CLUSTER_PID_STAT,9), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER,
     PMDA_PMUNITS(0,0,0,0,0,0) } },
 
 /* proc.psinfo.cmin_flt */
   { NULL,
-    { PMDA_PMID(CLUSTER_PID_STAT,10), PM_TYPE_U32, PROC_INDOM, PM_SEM_DISCRETE,
+    { PMDA_PMID(CLUSTER_PID_STAT,10), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER,
     PMDA_PMUNITS(0,0,0,0,0,0) } },
 
 /* proc.psinfo.maj_flt */
   { NULL,
-    { PMDA_PMID(CLUSTER_PID_STAT,11), PM_TYPE_U32, PROC_INDOM, PM_SEM_DISCRETE,
+    { PMDA_PMID(CLUSTER_PID_STAT,11), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER,
     PMDA_PMUNITS(0,0,0,0,0,0) } },
 
 /* proc.psinfo.cmaj_flt */
   { NULL,
-    { PMDA_PMID(CLUSTER_PID_STAT,12), PM_TYPE_U32, PROC_INDOM, PM_SEM_DISCRETE,
+    { PMDA_PMID(CLUSTER_PID_STAT,12), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER,
     PMDA_PMUNITS(0,0,0,0,0,0) } },
 
 /* proc.psinfo.utime */

From nscott@aconex.com Sun Nov 19 21:39:09 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:28 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5d8aG020655 for ; Sun, 19 Nov 2006 21:39:09 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id E1EF428CA6; Mon, 20 Nov 2006 16:10:50 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id C6E2F53403A; Mon, 20 Nov 2006 16:10:50 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 29113-01-22; Mon, 20 Nov 2006 16:10:46 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id CA1785340F3; Mon, 20 Nov 2006 16:10:46 +1100 (EST)
Subject: [PATCH 12/12] Add missing include files in man pages
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:11:24 +1100
Message-Id: <1163999484.4695.244.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 581
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

Some of the low level impl.h PCP routines are missing a header in
their SYNOPSIS section (most are fine, just these three were missing).

-- 
Nathan

Index: devel-pcp-2.5.99/man/man3/pmparsectime.3
===================================================================
--- devel-pcp-2.5.99.orig/man/man3/pmparsectime.3	2006-11-20 12:22:17.929549000 +1100
+++ devel-pcp-2.5.99/man/man3/pmparsectime.3	2006-11-20 12:22:30.090309000 +1100
@@ -35,6 +35,8 @@
 \f3__pmParseCtime\f1 \- convert \fBctime\fR(3) string to \fBtm\fR structure
 .SH "C SYNOPSIS"
 .ft 3
+#include
+.br
 #include
 .sp
 int __pmParseCtime(const char *string, struct tm *rslt, char **errmsg)
Index: devel-pcp-2.5.99/man/man3/pmparsedebug.3
===================================================================
--- devel-pcp-2.5.99.orig/man/man3/pmparsedebug.3	2006-11-20 12:22:17.945550000 +1100
+++ devel-pcp-2.5.99/man/man3/pmparsedebug.3	2006-11-20 12:22:30.090309000 +1100
@@ -35,6 +35,8 @@
 \f3__pmParseDebug\f1 \- convert a list of debug flags into an integer
 .SH "C SYNOPSIS"
 .ft 3
+#include
+.br
 #include
 .sp
 int __pmParseDebug(const char *spec)
Index: devel-pcp-2.5.99/man/man3/pmparsetime.3
===================================================================
--- devel-pcp-2.5.99.orig/man/man3/pmparsetime.3	2006-11-20 12:22:17.973551750 +1100
+++ devel-pcp-2.5.99/man/man3/pmparsetime.3	2006-11-20 12:22:30.090309000 +1100
@@ -35,6 +35,8 @@
 \f3__pmParseTime\f1 \- parse time point specification
 .SH "C SYNOPSIS"
 .ft 3
+#include
+.br
 #include
 .sp
 int __pmParseTime(const char *string,

From nscott@aconex.com Sun Nov 19 21:39:12 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:29 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020687 for ; Sun, 19 Nov 2006 21:39:11 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id BC94728C96; Mon, 20 Nov 2006 16:10:25 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id C9348534039; Mon, 20 Nov 2006 16:10:24 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28395-01-25; Mon, 20 Nov 2006 16:10:20 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 68FB85340F7; Mon, 20 Nov 2006 16:10:18 +1100 (EST)
Subject: [PATCH 08/12] Additional Windows PMDA uname metrics
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:10:55 +1100
Message-Id: <1163999455.4695.239.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 583
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

There are several uname-related metrics which we don't export from the
Windows agent at the moment. This adds them all in, or placeholders
for those I don't quite know how to extract yet. This makes the output
from the pcp(1) command make a bit more sense on Windows.

-- 
Nathan

Index: devel-pcp-2.5.99/src/pmdas/windows/GNUmakefile
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/GNUmakefile	2006-11-20 11:44:33.488030250 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/GNUmakefile	2006-11-20 11:46:05.033751500 +1100
@@ -34,7 +34,7 @@ SHIM_OBJECTS = $(SHIM_CFILES:.c=.obj)
 LLDLIBS = -lpcp -lpcp_pmda
 LCFLAGS = -I.
 PMNS = pmns.disk pmns.kernel pmns.mem pmns.network \
-       pmns.sqlserver pmns.filesys pmns.hinv
+       pmns.sqlserver pmns.filesys pmns.hinv pmns.pmda
 LSRCFILES = $(SHIM_CFILES) \
        Install Remove $(PMNS) root README \
        GNUmakefile.install shim.save.uu \
Index: devel-pcp-2.5.99/src/pmdas/windows/README
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/README	2006-11-20 11:46:29.259265500 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/README	2006-11-20 11:47:20.222450500 +1100
@@ -13,7 +13,7 @@ Data Helper) APIs.
 
 To view the help tex the following command will list all the available
 metrics and their explanatory "help" text:
-    $ pminfo -fT windows
+    $ pminfo -fT kernel disk mem network filesys sqlserver hinv pmda
 
 Installation
 ============
Index: devel-pcp-2.5.99/src/pmdas/windows/Remove
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/Remove	2006-11-20 11:47:31.467153250 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/Remove	2006-11-20 11:47:40.031688500 +1100
@@ -37,7 +37,7 @@ iam=windows
 
 # has to match top-level names in ./root and be the same as Install
 #
-pmns_name="hinv kernel disk mem network sqlserver filesys"
+pmns_name="hinv kernel disk mem network sqlserver filesys pmda"
 
 # Do it
 #
Index: devel-pcp-2.5.99/src/pmdas/windows/Install
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/Install	2006-11-20 11:48:02.189073250 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/Install	2006-11-20 11:48:08.241451500 +1100
@@ -41,7 +41,7 @@ pmda_interface=3
 
 # has to match top-level names in ./root and be the same as Remove
 #
-pmns_name="hinv kernel disk mem network sqlserver filesys"
+pmns_name="hinv kernel disk mem network sqlserver filesys pmda"
 
 # Do it
 #
Index: devel-pcp-2.5.99/src/pmdas/windows/data.c
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/data.c	2006-11-20 11:49:49.043751250 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/data.c	2006-11-20 11:52:19.825174500 +1100
@@ -624,6 +624,37 @@ static struct {
         PMDA_PMUNITS(0, 0, 0, 0, 0, 0)
       }, Q_KERNEL, M_NONE, ""
     },
+/* kernel.uname.version */
+    { { PMDA_PMID(0,111), PM_TYPE_STRING, PM_INDOM_NULL, PM_SEM_DISCRETE,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_KERNEL, M_NONE, ""
+    },
+/* kernel.uname.sysname */
+    { { PMDA_PMID(0,112), PM_TYPE_STRING, PM_INDOM_NULL, PM_SEM_DISCRETE,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_KERNEL, M_NONE, ""
+    },
+/* kernel.uname.machine */
+    { { PMDA_PMID(0,113), PM_TYPE_STRING, PM_INDOM_NULL, PM_SEM_DISCRETE,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_KERNEL, M_NONE, ""
+    },
+/* kernel.uname.nodename */
+    { { PMDA_PMID(0,114), PM_TYPE_STRING, PM_INDOM_NULL, PM_SEM_DISCRETE,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_KERNEL, M_NONE, ""
+    },
+
+/* pmda.uname */
+    { { PMDA_PMID(0,115), PM_TYPE_STRING, PM_INDOM_NULL, PM_SEM_DISCRETE,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_KERNEL, M_NONE, ""
+    },
+/* pmda.version */
+    { { PMDA_PMID(0,116), PM_TYPE_STRING, PM_INDOM_NULL, PM_SEM_DISCRETE,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_KERNEL, M_NONE, ""
+    },
 
 };
 
Index: devel-pcp-2.5.99/src/pmdas/windows/pmns.kernel
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/pmns.kernel	2006-11-20 11:58:04.822735500 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/pmns.kernel	2006-11-20 11:58:54.229823250 +1100
@@ -40,4 +40,8 @@ kernel.all.file {
 kernel.uname {
     distro      WINDOWS:0:109
     release     WINDOWS:0:110
+    version     WINDOWS:0:111
+    sysname     WINDOWS:0:112
+    machine     WINDOWS:0:113
+    nodename    WINDOWS:0:114
 }
Index: devel-pcp-2.5.99/src/pmdas/windows/pmns.pmda
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ devel-pcp-2.5.99/src/pmdas/windows/pmns.pmda	2006-11-20 11:59:07.514653500 +1100
@@ -0,0 +1,4 @@
+pmda {
+    uname       WINDOWS:0:115
+    version     WINDOWS:0:116
+}
Index: devel-pcp-2.5.99/src/pmdas/windows/root
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/root	2006-11-20 11:59:31.692164500 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/root	2006-11-20 11:59:45.561031250 +1100
@@ -12,6 +12,7 @@ root {
     mem
     network
     sqlserver
+    pmda
 }
 
 #include "pmns.hinv"
@@ -21,4 +22,5 @@ root {
 #include "pmns.mem"
 #include "pmns.network"
 #include "pmns.sqlserver"
+#include "pmns.pmda"
 

From nscott@aconex.com Sun Nov 19 21:39:12 2006
Received: with ECARTIS (v1.0.0;
	list pcp); Sun, 19 Nov 2006 21:39:25 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020706 for ; Sun, 19 Nov 2006 21:39:12 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 2DCC928CB0; Mon, 20 Nov 2006 16:08:50 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 0D887534039; Mon, 20 Nov 2006 16:08:50 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 29113-01-8; Mon, 20 Nov 2006 16:08:49 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id E680B53403A; Mon, 20 Nov 2006 16:08:47 +1100 (EST)
Subject: [PATCH 03/12] Fix typos in pmdumptext
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:09:25 +1100
Message-Id: <1163999365.4695.234.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 578
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

Fix some harmless typos in the pmdumptext source.

-- 
Nathan

Index: dirty-pcp-2.5.99/src/pmdumptext/pmdumptext.c++
===================================================================
--- dirty-pcp-2.5.99.orig/src/pmdumptext/pmdumptext.c++	2006-11-20 10:59:25.422786750 +1100
+++ dirty-pcp-2.5.99/src/pmdumptext/pmdumptext.c++	2006-11-20 10:59:52.652488500 +1100
@@ -883,7 +883,7 @@ main(int argc, char *argv[])
            pmnsFile = optarg;
            break;
 
-       case 'N':       // show normalizartion values
+       case 'N':       // show normalization values
            normFlag = PMC_true;
            break;
 
@@ -1209,8 +1209,8 @@ main(int argc, char *argv[])
 
     if (isLive) {
        gettimeofday(&logStartTime, NULL);
-       logEndTime.tv_sec = INT_MAX;
-       logEndTime.tv_usec = INT_MAX;
+       logEndTime.tv_sec = INT_MAX;
+       logEndTime.tv_usec = INT_MAX;
     }
     else {
        group->updateBounds();

From nscott@aconex.com Sun Nov 19 21:39:09 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:27 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5d8aG020656 for ; Sun, 19 Nov 2006 21:39:09 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 3C97828B8C; Mon, 20 Nov 2006 16:10:30 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 1CBDC5340F1; Mon, 20 Nov 2006 16:10:30 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28591-01-29; Mon, 20 Nov 2006 16:10:27 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id A7598534039; Mon, 20 Nov 2006 16:10:27 +1100 (EST)
Subject: [PATCH 09/12] Fix Windows filesys metrics
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com, kmcdonell@aconex.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:11:05 +1100
Message-Id: <1163999465.4695.241.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 580
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

This patch corrects the value reported for the existing filesys.full
metric on Windows, and adds in other filesystem fullness related
metrics (i.e. capacity/used/free).

-- 
Nathan

Index: devel-pcp-2.5.99/src/pmdas/windows/data.c
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/data.c	2006-11-20 12:01:59.741417000 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/data.c	2006-11-20 12:02:48.404458250 +1100
@@ -656,6 +656,32 @@ static struct {
       }, Q_KERNEL, M_NONE, ""
     },
 
+/* filesys.capacity */
+    { { PMDA_PMID(0,117), PM_TYPE_U64, LDISK_INDOM, PM_SEM_INSTANT,
+        PMDA_PMUNITS(1,0,0,PM_SPACE_KBYTE,0,0)
+      }, Q_LDISK, M_NONE, ""
+    },
+/* filesys.used */
+    { { PMDA_PMID(0,118), PM_TYPE_U64, LDISK_INDOM, PM_SEM_INSTANT,
+        PMDA_PMUNITS(1,0,0,PM_SPACE_KBYTE,0,0)
+      }, Q_LDISK, M_NONE, ""
+    },
+/* filesys.free */
+    { { PMDA_PMID(0,119), PM_TYPE_U64, LDISK_INDOM, PM_SEM_INSTANT,
+        PMDA_PMUNITS(1,0,0,PM_SPACE_KBYTE,0,0)
+      }, Q_LDISK, M_NONE, ""
+    },
+/* dummy - filesys.free_space */
+    { { PMDA_PMID(0,120), PM_TYPE_U32, LDISK_INDOM, PM_SEM_INSTANT,
+        PMDA_PMUNITS(1,0,0,PM_SPACE_MBYTE,0,0)
+      }, Q_LDISK, M_NONE, "\\LogicalDisk(*/*#*)\\Free Megabytes"
+    },
+/* dummy - filesys.free_percent */
+    { { PMDA_PMID(0,121), PM_TYPE_FLOAT, LDISK_INDOM, PM_SEM_INSTANT,
+        PMDA_PMUNITS(0,0,0,0,0,0)
+      }, Q_LDISK, M_NONE, "\\LogicalDisk(*/*#*)\\% Free Space"
+    },
+
 };
 
 int metrictab_sz = sizeof(metricdesc) / sizeof(metricdesc[0]);
Index: devel-pcp-2.5.99/src/pmdas/windows/pmda.c
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/pmda.c	2006-11-20 12:03:43.891926000 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/pmda.c	2006-11-20 12:09:24.461210250 +1100
@@ -507,27 +507,45 @@ redo_indom(int idx)
 static void
 prefetch(int numpmid, pmID pmidlist[])
 {
-    int         delta;
-    int         sts;
+    int         delta, numextra = 0;
+    int         sts, i;
     pmID        *dst;
+    __pmID_int  extra[2];
+    __pmID_int  *pmidp;
 
 #ifdef PCP_DEBUG
     if (pmDebug & DBG_TRACE_APPL2) {
-        fprintf(stderr, "prefetch(numpid=%d, ...)\n", numpmid);
+        fprintf(stderr, "prefetch(numpmid=%d, ...)\n", numpmid);
     }
 #endif
 
-    delta = numpmid * sizeof(pmID) - shm->segment[SEG_SCRATCH].elt_size * shm->segment[SEG_SCRATCH].nelt;
+    /* we have derived filesys metrics, so may need to fetch more... ugh */
+    for (i = 0; i < numpmid; i++) {
+        pmidp = (__pmID_int *)&pmidlist[i];
+        if ((pmidp->cluster == 0) &&
+            (pmidp->item >= 117 && pmidp->item <= 119)) {
+            extra[0] = extra[1] = *pmidp;
+            extra[0].item = 120;
+            extra[1].item = 121;
+            numextra = 2;
+            break;
+        }
+    }
+
+    delta = (numpmid + numextra) * sizeof(pmID) -
+        shm->segment[SEG_SCRATCH].elt_size * shm->segment[SEG_SCRATCH].nelt;
     if (delta > 0) {
         memcpy(new_hdr, shm, hdr_size);
-        new_hdr->segment[SEG_SCRATCH].nelt = numpmid * sizeof(pmID);
+        new_hdr->segment[SEG_SCRATCH].nelt = (numpmid+numextra) * sizeof(pmID);
         new_hdr->size += delta;
         shm_reshape(new_hdr);
     }
     dst = (pmID *)&((char *)shm)[shm->segment[SEG_SCRATCH].base];
+    if (numextra)
+        memcpy(dst + numpmid, extra, numextra * sizeof(pmID));
     memcpy(dst, pmidlist, numpmid * sizeof(pmID));
     numatoms = 0;
-    fprintf(send_f, "prefetch %d\n", numpmid);
+    fprintf(send_f, "prefetch %d\n", numpmid + numextra);
     fflush(send_f);
     if (fgets(response, sizeof(response), recv_f) == NULL) {
         fprintf(stderr, "prefetch: recv EOF: %s\n", strerror(errno));
@@ -602,7 +620,7 @@ fetch(int numpmid, pmID pmidlist[], pmRe
 static int
 fetch_callback(pmdaMetric *mdesc, unsigned int inst, pmAtomValue *atom)
 {
-    int                 i;
+    int                 i, count;
     shm_result_t        *rtab;
     pmAtomValue         myatom;
     __pmID_int          *pmidp;
@@ -693,10 +711,55 @@ fetch_callback(pmdaMetric *mdesc, unsign
     if (numatoms <= 0)
         return numatoms;
 
+    rtab = (shm_result_t *)&((char *)shm)[shm->segment[SEG_SCRATCH].base];
+
+    /*
+     * special case the filesystem metrics at this point -
+     * mapping the PDH services semantics for these to the
+     * saner metrics from other platforms is not pretty...
+     */
+    if ((pmidp->cluster == 0) &&
+        (pmidp->item == 67 || (pmidp->item >= 117 && pmidp->item <= 119))) {
+        float used_space, free_space, free_percent;
+        unsigned long long used, avail, capacity;
+        int item;
+
+        for (count = 0, i = 0; i < numatoms; i++) {
+            if (rtab[i].r_inst != inst)
+                continue;
+            if (pmidp->item == 67) {    /* filesys.full, rtab holds %Free */
+                atom->f = (1.0 - rtab[i].r_atom.f) * 100.0;
+                return 1;
+            }
+            item = ((__pmID_int*)&rtab[i].r_pmid)->item;
+            if (item == 120) {          /* dummy metric, rtab holds FreeMB */
+                free_space = ((float)rtab[i].r_atom.ul);
+                count++;
+            } else if (item == 121) {   /* dummy metric, rtab holds %Free */
+                free_percent = rtab[i].r_atom.f;
+                count++;
+            }
+        }
+        if (count != 2) /* we need both "dummy" metric values below */
+            return 0;
+
+        used_space = free_space * (1.0 - free_percent);
+        used = 1024 * (unsigned long long)used_space;   /* MB to KB */
+        avail = 1024 * (unsigned long long)free_space;  /* MB to KB */
+        capacity = used + avail;
+
+        if (pmidp->item == 117)         /* filesys.capacity */
+            atom->ull = capacity;
+        else if (pmidp->item == 118)    /* filesys.used */
+            atom->ull = used;
+        else if (pmidp->item == 119)    /* filesys.free */
+            atom->ull = avail;
+        return 1;
+    }
+
     /*
      * search in shm for pmAtomValues previously deposited by prefetch
      */
-    rtab = (shm_result_t *)&((char *)shm)[shm->segment[SEG_SCRATCH].base];
     for (i = 0; i < numatoms; i++) {
         if (rtab[i].r_pmid == mdesc->m_desc.pmid && rtab[i].r_inst == inst) {
             *atom = rtab[i].r_atom;
Index: devel-pcp-2.5.99/src/pmdas/windows/pmns.filesys
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/pmns.filesys	2006-11-20 12:10:17.488524250 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/pmns.filesys	2006-11-20 12:11:07.567654000 +1100
@@ -1,4 +1,7 @@
 filesys {
     full        WINDOWS:0:67
+    capacity    WINDOWS:0:117
+    used        WINDOWS:0:118
+    free        WINDOWS:0:119
 }
 

From nscott@aconex.com Sun Nov 19 21:39:11 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:26 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020683 for ; Sun, 19 Nov 2006 21:39:10 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id CE00028C72; Mon, 20 Nov 2006 16:08:02 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id B62405340F5; Mon, 20 Nov 2006 16:08:02 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28591-01-14; Mon, 20 Nov 2006 16:07:59 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 3A5815340F3; Mon, 20 Nov 2006 16:07:59 +1100 (EST)
Subject: [PATCH 01/12] Fix Windows PMDA build
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:08:34 +1100
Message-Id: <1163999314.4695.231.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 579
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

Currently the Makefile for the Windows agent is in a half-broken
state, it assumes in some places that uuencoding has been done on
certain files where it has not (uuencode/uudecode doesn't seem to be
part of a default Cygwin install either).
This reverts the Windows PMDA Makefile to its earlier, working state.

-- 
Nathan

Index: devel-pcp-2.5.99/src/pmdas/windows/GNUmakefile
===================================================================
--- devel-pcp-2.5.99.orig/src/pmdas/windows/GNUmakefile	2006-11-20 15:01:56.636180750 +1100
+++ devel-pcp-2.5.99/src/pmdas/windows/GNUmakefile	2006-11-20 15:02:11.969139000 +1100
@@ -37,8 +37,8 @@ PMNS = pmns.disk pmns.kernel pmns.mem p
        pmns.sqlserver pmns.filesys pmns.hinv
 LSRCFILES = $(SHIM_CFILES) \
        Install Remove $(PMNS) root README \
-       GNUmakefile.install shim.save.uu \
-       match-counters show-all-ctrs.c show-all-ctrs.save.uu \
+       GNUmakefile.install shim.save \
+       match-counters show-all-ctrs.c show-all-ctrs.save \
        all-on-tower
 PMDADIR = $(PCP_PMDAS_DIR)/$(IAM)
 DDKROOT = C:\WINDDK\3790
@@ -147,7 +147,7 @@ foo:
        @echo do_build=$(do_build)
        @echo LSRCFILES=$(LSRCFILES)
        @echo CMDTARGET=$(CMDTARGET)
-       @echo SHIMTARGET=$(SHIMTARGET)
+       @echo SHIMTARGET=$(SHIMTARGET)
 
 show-all-ctrs.exe: show-all-ctrs.obj pdherr.obj
        $(SHIM_LINK) /out:show-all-ctrs.exe show-all-ctrs.obj pdherr.obj pdh.lib advapi32.lib $(SHIM_LINK_FLAGS)

From nscott@aconex.com Sun Nov 19 21:39:15 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:30 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dDaG020761 for ; Sun, 19 Nov 2006 21:39:14 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 08A4C28B64; Mon, 20 Nov 2006 16:09:04 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id E146F53403A; Mon, 20 Nov 2006 16:09:03 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28395-01-17; Mon, 20 Nov 2006 16:09:03 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 21C9A534039; Mon, 20 Nov 2006 16:09:03 +1100 (EST)
Subject: [PATCH 05/12] Fix pmval -f typo
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton , Mark Goodwin
Cc: pcp@oss.sgi.com
Content-Type: text/plain
Organization: Aconex
Date: Mon, 20 Nov 2006 16:09:40 +1100
Message-Id: <1163999381.4695.236.camel@edge>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 584
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

Cut and paste error in the diagnostics for pmval's -f option.

-- 
Nathan

Index: devel-pcp-2.5.99/src/pmval/pmval.c
===================================================================
--- devel-pcp-2.5.99.orig/src/pmval/pmval.c	2006-11-20 11:15:25.102763000 +1100
+++ devel-pcp-2.5.99/src/pmval/pmval.c	2006-11-20 11:17:07.785180250 +1100
@@ -1072,7 +1072,7 @@ getargs(int argc, /* in - command line
         case 'f':               /* fixed format count */
             d = (int)strtol(optarg, &endnum, 10);
             if (*endnum != '\0' || d < 0) {
-                fprintf(stderr, "%s: -s requires +ve numeric argument\n", pmProgname);
+                fprintf(stderr, "%s: -f requires +ve numeric argument\n", pmProgname);
                 errflag++;
             }
             fixed = d;

From nscott@aconex.com Sun Nov 19 21:39:09 2006
Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:25 -0800 (PST)
Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5d8aG020653 for ; Sun, 19 Nov 2006 21:39:09 -0800
Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 5C40028C5D; Mon, 20 Nov 2006 16:10:35 +1100 (EST)
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 36BEA5340F6; Mon, 20 Nov 2006 16:10:35 +1100 (EST)
Received: from page.mel.office.aconex.com
([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28395-01-26; Mon, 20 Nov 2006 16:10:34 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 1B2A75340F1; Mon, 20 Nov 2006 16:10:34 +1100 (EST) Subject: [PATCH 10/12] Fix typos on the pmie(1) man page From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 16:11:11 +1100 Message-Id: <1163999471.4695.242.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 578 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Fix a couple of typos in pmie(1). -- Nathan Index: devel-pcp-2.5.99/man/man1/pmie.1 =================================================================== --- devel-pcp-2.5.99.orig/man/man1/pmie.1 2006-11-20 12:18:15.262383250 +1100 +++ devel-pcp-2.5.99/man/man1/pmie.1 2006-11-20 12:19:08.653720000 +1100 @@ -619,8 +619,8 @@ True if at least \fIN\fP percent of set T} .TE .P -The following instantial operators may be used filter or limit a -a set-valued logical expression, based on regular expression matching +The following instantial operators may be used to filter or limit a +set-valued logical expression, based on regular expression matching of instance names. 
The logical expression must be a set involving the dimension of instances, and the regular expression is of the form used by From nscott@aconex.com Sun Nov 19 21:39:11 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:37 -0800 (PST) Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020681 for ; Sun, 19 Nov 2006 21:39:10 -0800 Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id CF1A928CA3; Mon, 20 Nov 2006 16:10:41 +1100 (EST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id B320D5340F3; Mon, 20 Nov 2006 16:10:41 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 29113-01-21; Mon, 20 Nov 2006 16:10:41 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id EB56F534039; Mon, 20 Nov 2006 16:10:40 +1100 (EST) Subject: [PATCH 11/12] Fix pmparsetime(3) man page typo From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 16:11:18 +1100 Message-Id: <1163999478.4695.243.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 588 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Fixes a pmparsetime(3) typo. 
-- Nathan Index: devel-pcp-2.5.99/man/man3/pmparsetime.3 =================================================================== --- devel-pcp-2.5.99.orig/man/man3/pmparsetime.3 2006-11-20 12:20:32.706973000 +1100 +++ devel-pcp-2.5.99/man/man3/pmparsetime.3 2006-11-20 12:21:01.092747000 +1100 @@ -68,8 +68,8 @@ for example), while should have its tv_sec component set to INT_MAX. .P The -.BR rslt , -structures must be allocated before calling +.B rslt +structure must be allocated before calling .BR __pmParseTime . .P You also need to set the current PCP reporting time zone to correctly From nscott@aconex.com Sun Nov 19 21:39:10 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:39 -0800 (PST) Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5d8aG020657 for ; Sun, 19 Nov 2006 21:39:09 -0800 Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 9229628CAA; Mon, 20 Nov 2006 16:08:39 +1100 (EST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 6BA435340F3; Mon, 20 Nov 2006 16:08:39 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28395-01-14; Mon, 20 Nov 2006 16:08:38 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id A3201534039; Mon, 20 Nov 2006 16:08:38 +1100 (EST) Subject: [PATCH 02/12] Fix Windows libpcp_pmda build From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 16:09:16 +1100 Message-Id: <1163999356.4695.233.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 589 X-ecartis-version: Ecartis v1.0.0 Sender: 
pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Currently the libpcp_pmda library Makefile does not provide any valid default target on Windows. This fixes that (build tested on Linux and on Windows). -- Nathan Index: devel-pcp-2.5.99/src/libpcp_pmda/src/GNUmakefile =================================================================== --- devel-pcp-2.5.99.orig/src/libpcp_pmda/src/GNUmakefile 2006-11-20 14:33:58.379296250 +1100 +++ devel-pcp-2.5.99/src/libpcp_pmda/src/GNUmakefile 2006-11-20 14:43:41.539741500 +1100 @@ -56,7 +56,7 @@ LLDLIBS = -lpcp LDIRT = $(LIBTARGET_V1) $(LIBTARGET_V2) $(LIBTARGET_V3) -default: $(LIBTARGET_V1) $(LIBTARGET_V2) +default: $(LIBTARGET_V1) $(LIBTARGET_V2) $(LIBTARGET) $(LIBTARGET_V1) $(LIBTARGET_V2): $(LIBTARGET_V3) $(LN_S) -f $(LIBTARGET_V3) $(LIBTARGET_V2) From nscott@aconex.com Sun Nov 19 21:39:11 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:33 -0800 (PST) Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020686 for ; Sun, 19 Nov 2006 21:39:10 -0800 Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id B0EF328997; Mon, 20 Nov 2006 16:07:10 +1100 (EST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 8DA51534039; Mon, 20 Nov 2006 16:07:10 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28591-01-9; Mon, 20 Nov 2006 16:07:09 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 1FA3153403A; Mon, 20 Nov 2006 16:07:09 +1100 (EST) Subject: [PATCH 00/12] Series of small PCP fixups From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: 
pcp@oss.sgi.com, kmcdonell@aconex.com Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 16:07:46 +1100 Message-Id: <1163999266.4695.228.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 586 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Hi guys, Here comes a series of PCP fixes from all over the shop, all are small things I've found in using PCP for the last month or so. This series will apply cleanly to the pcp-2.5.99 tarball which Max left here: ftp://oss.sgi.com/projects/pcp/download/dev/pcp-2.5.99-20060717.src.tar.gz Please apply (and please lemme know when a 2.6.0 release is planned if you can, and if that boat has not already sailed - thanks!). cheers. -- Nathan From nscott@aconex.com Sun Nov 19 21:39:12 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 19 Nov 2006 21:39:37 -0800 (PST) Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK5dAaG020688 for ; Sun, 19 Nov 2006 21:39:11 -0800 Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id DE1C128C5F; Mon, 20 Nov 2006 16:09:22 +1100 (EST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 265B95340F1; Mon, 20 Nov 2006 16:09:20 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 29113-01-11; Mon, 20 Nov 2006 16:09:17 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 0E1645340F5; Mon, 20 Nov 2006 16:09:17 +1100 (EST) Subject: [PATCH 06/12] Fix process priority metric From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: 
pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 16:09:54 +1100 Message-Id: <1163999394.4695.237.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 587 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp The units for a process's priority are not kilobytes; this looks like another cut and paste typo. -- Nathan Index: devel-pcp-2.5.99/src/pmdas/linux/pmda.c =================================================================== --- devel-pcp-2.5.99.orig/src/pmdas/linux/pmda.c 2006-11-20 11:28:18.075070750 +1100 +++ devel-pcp-2.5.99/src/pmdas/linux/pmda.c 2006-11-20 11:28:23.535412000 +1100 @@ -1183,7 +1183,7 @@ static pmdaMetric metrictab[] = { /* proc.psinfo.priority */ { NULL, { PMDA_PMID(CLUSTER_PID_STAT,17), PM_TYPE_U32, PROC_INDOM, PM_SEM_DISCRETE, - PMDA_PMUNITS(1,0,0,PM_SPACE_KBYTE,0,0) } }, + PMDA_PMUNITS(0,0,0,0,0,0) } }, /* proc.psinfo.nice */ { NULL, From nscott@aconex.com Mon Nov 20 23:02:09 2006 Received: with ECARTIS (v1.0.0; list pcp); Mon, 20 Nov 2006 23:02:16 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAL726aG003008 for ; Mon, 20 Nov 2006 23:02:08 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 3EAED5340F5; Tue, 21 Nov 2006 17:31:48 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 29834-01-35; Tue, 21 Nov 2006 17:31:42 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 74EC15340F7; Tue, 21 Nov 2006 17:31:36 +1100 (EST) Subject: [PATCH] Fix Linux PMDA CPU time metrics From: Nathan Scott Reply-To: nscott@aconex.com To: 
Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-IcbJvWTwD/RGu2WiCA1n" Organization: Aconex Date: Tue, 21 Nov 2006 17:32:19 +1100 Message-Id: <1164090739.4695.301.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 590 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-IcbJvWTwD/RGu2WiCA1n Content-Type: text/plain Content-Transfer-Encoding: 7bit Hi guys, This patch fixes the precision of the allcpu and percpu millisecond counters. This has been the source of problems in the past, and it looks like we missed another size transition a while back in 2.6.5+. Historically, 2.4 and earlier kernels used a funky int/long mix for idle and other CPU times. Some code was put into the agent back then to try to circumvent counter wrapping (hohum). Then, in 2.6.0 through 2.6.4 (inclusive), all CPU time counters were made 32 bits for all platforms (argh!). Then, in 2.6.5 (and beyond), _all_ CPU time counters got changed to be 64 bits unconditionally. At least someone finally got it right. :] However, we missed this last transition to 64 bits, and the Linux agent hasn't been updated. This patch does that, and dynamically sets the type of the CPU time metrics depending on which of the 3 kernel versions/flavours is running. In the process of fixing this, it became clear that the wrap handling was going to be extremely hard to get right for all cases (it is wrong now, after the kernel type changes a few years back), so I removed it completely. I don't think this will affect anyone in practice. I noticed the context switch count and interrupt count are also not being exported correctly, so I fixed those up at the same time (these ones seem to be always 32 bits on 2.4 and always 64 bits on 2.6). 
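The three kernel flavours described above can be sketched roughly as follows. This is only an illustration and not the code in the attached patch: the names cpu_sizes and sizes_for_release are hypothetical, the third component of the release string is parsed here so that 2.6.0 to 2.6.4 can be told apart from 2.6.5 and later, and treating 2.5.x or unparseable release strings as 64 bit is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the byte sizes of the /proc/stat counters. */
struct cpu_sizes {
    int ctxt;      /* context switch count */
    int intr;      /* interrupt sum count */
    int cputime;   /* user/nice/sys/wait/irq times */
    int idletime;  /* idle time */
};

/*
 * Map a uname release string (e.g. "2.6.18-foo") to counter sizes,
 * following the three flavours described in the mail:
 *   2.4 and earlier: 32 bit counts and CPU times, "long" idle time
 *   2.6.0 to 2.6.4:  32 bit CPU times, 64 bit counts
 *   2.6.5 and later: everything 64 bit
 * Anything that cannot be parsed falls back to 64 bit (an assumption).
 */
static struct cpu_sizes
sizes_for_release(const char *release)
{
    struct cpu_sizes s = { 8, 8, 8, 8 };
    int major = 0, minor = 0, point = 0;
    int n;

    assert(release != NULL);
    n = sscanf(release, "%d.%d.%d", &major, &minor, &point);
    if (n < 2)
        return s;               /* no usable major.minor pair */

    if (major < 2 || (major == 2 && minor <= 4)) {
        s.ctxt = s.intr = s.cputime = 4;
        s.idletime = (int)sizeof(unsigned long);
    }
    else if (major == 2 && minor == 6 && n == 3 && point <= 4) {
        s.cputime = s.idletime = 4;
    }
    return s;
}
```

A caller could then pick PM_TYPE_U64 where a size is 8 and PM_TYPE_U32 otherwise, in the same spirit as the _pm_metric_type macro in the attached patch.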
Finally, there's a new CPU time being accounted in recent 2.6 kernels ("steal") - I've not updated the agent to export that as yet (it is always zero on my boxen). cheers. -- Nathan --=-IcbJvWTwD/RGu2WiCA1n Content-Disposition: attachment; filename=fix-linux-percpu-metrics Content-Type: text/x-patch; name=fix-linux-percpu-metrics; charset=UTF-8 Content-Transfer-Encoding: 7bit Index: devel-pcp-2.5.99/src/pmdas/linux/pmda.c =================================================================== --- devel-pcp-2.5.99.orig/src/pmdas/linux/pmda.c 2006-11-21 11:56:25.496190500 +1100 +++ devel-pcp-2.5.99/src/pmdas/linux/pmda.c 2006-11-21 17:05:16.378298250 +1100 @@ -80,12 +80,26 @@ */ #if defined(HAVE_64BIT_LONG) #define KERNEL_ULONG PM_TYPE_U64 -#define _pm_assign_ulong(atomp, val) (atomp)->ull = val +#define _pm_assign_ulong(atomp, val) do { (atomp)->ull = (val); } while (0) #else #define KERNEL_ULONG PM_TYPE_U32 -#define _pm_assign_ulong(atomp, val) (atomp)->ul = val +#define _pm_assign_ulong(atomp, val) do { (atomp)->ul = (val); } while (0) #endif +/* + * Some metrics need to have their type set at runtime, based on the + * running kernel version (not simply a 64 vs 32 bit machine issue). + */ +#define KERNEL_UTYPE PM_TYPE_NOSUPPORT /* set to real type at runtime */ +#define _pm_metric_type(type, size) \ + do { \ + (type) = ((size)==8 ? 
PM_TYPE_U64 : PM_TYPE_U32); \ + } while (0) +#define _pm_assign_utype(size, atomp, val) \ + do { \ + if ((size)==8) { (atomp)->ull = (val); } else { (atomp)->ul = (val); } \ + } while (0) + static proc_stat_t proc_stat; static proc_meminfo_t proc_meminfo; static proc_loadavg_t proc_loadavg; @@ -117,6 +131,10 @@ static int _isDSO = 1; /* =0 I am a dae /* globals */ size_t _pm_system_pagesize; /* for hinv.pagesize and used elsewhere */ int _pm_have_proc_vmstat; /* if /proc/vmstat is available */ +int _pm_intr_size; /* size in bytes of interrupt sum count metric */ +int _pm_ctxt_size; /* size in bytes of context switch count metric */ +int _pm_cputime_size; /* size in bytes of most of the cputime metrics */ +int _pm_idletime_size; /* size in bytes of the idle cputime metric */ /* * Metric Instance Domains (statically initialized ones only) @@ -201,22 +219,22 @@ static pmdaMetric metrictab[] = { /* kernel.percpu.cpu.user */ { NULL, - { PMDA_PMID(CLUSTER_STAT,0), PM_TYPE_U32, CPU_INDOM, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,0), KERNEL_UTYPE, CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.percpu.cpu.nice */ { NULL, - { PMDA_PMID(CLUSTER_STAT,1), PM_TYPE_U32, CPU_INDOM, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,1), KERNEL_UTYPE, CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.percpu.cpu.sys */ { NULL, - { PMDA_PMID(CLUSTER_STAT,2), PM_TYPE_U32, CPU_INDOM, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,2), KERNEL_UTYPE, CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.percpu.cpu.idle */ { NULL, - { PMDA_PMID(CLUSTER_STAT,3), KERNEL_ULONG, CPU_INDOM, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,3), KERNEL_UTYPE, CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* disk.dev.read */ @@ -301,12 +319,12 @@ static pmdaMetric metrictab[] = { /* kernel.all.intr */ { NULL, - { PMDA_PMID(CLUSTER_STAT,12), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_COUNTER, + { 
PMDA_PMID(CLUSTER_STAT,12), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) }, }, /* kernel.all.pswitch */ { NULL, - { PMDA_PMID(CLUSTER_STAT,13), KERNEL_ULONG, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,13), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) }, }, /* kernel.all.sysfork */ @@ -317,22 +335,22 @@ static pmdaMetric metrictab[] = { /* kernel.all.cpu.user */ { NULL, - { PMDA_PMID(CLUSTER_STAT,20), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,20), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.all.cpu.nice */ { NULL, - { PMDA_PMID(CLUSTER_STAT,21), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,21), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.all.cpu.sys */ { NULL, - { PMDA_PMID(CLUSTER_STAT,22), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,22), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.all.cpu.idle */ { NULL, - { PMDA_PMID(CLUSTER_STAT,23), KERNEL_ULONG, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,23), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* disk.all.read */ @@ -377,12 +395,12 @@ static pmdaMetric metrictab[] = { /* kernel.percpu.cpu.wait.total */ { NULL, - { PMDA_PMID(CLUSTER_STAT,30), PM_TYPE_U32, CPU_INDOM, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,30), KERNEL_UTYPE, CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.percpu.cpu.intr */ { NULL, - { PMDA_PMID(CLUSTER_STAT,31), PM_TYPE_U32, CPU_INDOM, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,31), KERNEL_UTYPE, CPU_INDOM, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* hinv.ncpu */ @@ -397,12 +415,12 @@ static pmdaMetric metrictab[] = { /* kernel.all.cpu.intr */ { NULL, - { 
PMDA_PMID(CLUSTER_STAT,34), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,34), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.all.cpu.wait.total */ { NULL, - { PMDA_PMID(CLUSTER_STAT,35), PM_TYPE_U32, PM_INDOM_NULL, PM_SEM_COUNTER, + { PMDA_PMID(CLUSTER_STAT,35), KERNEL_UTYPE, PM_INDOM_NULL, PM_SEM_COUNTER, PMDA_PMUNITS(0,1,0,0,PM_TIME_MSEC,0) }, }, /* kernel.all.hz */ @@ -3067,16 +3085,20 @@ linux_fetchCallBack(pmdaMetric *mdesc, u */ switch (idp->item) { case 0: /* user */ - atom->ul = 1000 * (double)proc_stat.p_user[inst] / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.p_user[inst] / proc_stat.hz); break; case 1: /* nice */ - atom->ul = 1000 * (double)proc_stat.p_nice[inst] / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.p_nice[inst] / proc_stat.hz); break; case 2: /* sys */ - atom->ul = 1000 * (double)proc_stat.p_sys[inst] / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.p_sys[inst] / proc_stat.hz); break; case 3: /* idle */ - _pm_assign_ulong(atom, 1000 * (double)proc_stat.p_idle[inst] / proc_stat.hz); + _pm_assign_utype(_pm_idletime_size, atom, + 1000 * (double)proc_stat.p_idle[inst] / proc_stat.hz); break; case 8: /* pagesin */ @@ -3104,10 +3126,10 @@ linux_fetchCallBack(pmdaMetric *mdesc, u atom->ul = proc_stat.page[1]; break; case 12: /* intr */ - atom->ul = proc_stat.intr[0]; + _pm_assign_utype(_pm_intr_size, atom, proc_stat.intr); break; case 13: /* ctxt */ - _pm_assign_ulong(atom, proc_stat.ctxt); + _pm_assign_utype(_pm_ctxt_size, atom, proc_stat.ctxt); break; case 14: /* processes */ _pm_assign_ulong(atom, proc_stat.processes); @@ -3115,31 +3137,39 @@ linux_fetchCallBack(pmdaMetric *mdesc, u /* gilly - change the calculation to prevent a bug */ case 20: /* all.user */ - atom->ul = 1000 * (double)proc_stat.user / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, 
atom, + 1000 * (double)proc_stat.user / proc_stat.hz); break; case 21: /* all.nice */ - atom->ul = 1000 * (double)proc_stat.nice / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.nice / proc_stat.hz); break; case 22: /* all.sys */ - atom->ul = 1000 * (double)proc_stat.sys / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.sys / proc_stat.hz); break; case 23: /* all.idle */ - _pm_assign_ulong(atom, 1000 * (double)proc_stat.idle / proc_stat.hz); + _pm_assign_utype(_pm_idletime_size, atom, + 1000 * (double)proc_stat.idle / proc_stat.hz); break; case 30: /* kernel.percpu.cpu.wait.total */ - atom->ul = 1000 * (double)proc_stat.p_wait[inst] / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.p_wait[inst] / proc_stat.hz); break; case 31: /* kernel.percpu.cpu.intr */ - atom->ul = 1000 * ((double)proc_stat.p_irq[inst] + - (double)proc_stat.p_sirq[inst]) / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * ((double)proc_stat.p_irq[inst] + + (double)proc_stat.p_sirq[inst]) / proc_stat.hz); break; case 34: /* kernel.all.cpu.intr */ - atom->ul = 1000 * ((double)proc_stat.irq + - (double)proc_stat.sirq) / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * ((double)proc_stat.irq + + (double)proc_stat.sirq) / proc_stat.hz); break; case 35: /* kernel.all.cpu.wait.total */ - atom->ul = 1000 * (double)proc_stat.wait / proc_stat.hz; + _pm_assign_utype(_pm_cputime_size, atom, + 1000 * (double)proc_stat.wait / proc_stat.hz); break; case 32: /* hinv.ncpu */ atom->ul = indomtab[CPU_INDOM].it_numinst; @@ -4319,6 +4349,8 @@ void linux_init(pmdaInterface *dp) { int need_refresh[NUM_CLUSTERS]; + int i, major, minor; + __pmID_int *idp; _pm_system_pagesize = getpagesize(); if (_isDSO) { @@ -4343,6 +4375,65 @@ linux_init(pmdaInterface *dp) proc_scsi.scsi_indom = &indomtab[SCSI_INDOM]; proc_slabinfo.indom = &indomtab[SLAB_INDOM]; + /* + * Figure out kernel 
version. The precision of certain metrics + * (e.g. percpu time counters) has changed over kernel versions. + * See include/linux/kernel_stat.h for all the various flavours. + */ + uname(&kernel_uname); + _pm_ctxt_size = 8; + _pm_intr_size = 8; + _pm_cputime_size = 8; + _pm_idletime_size = 8; + if (sscanf(kernel_uname.release, "%d.%d", &major, &minor) == 2) { + if (major < 2 || (major == 2 && minor <= 4)) { /* 2.4 and earlier */ + fprintf(stderr, "NOTICE: using kernel 2.4 or earlier CPU types\n"); + _pm_ctxt_size = 4; + _pm_intr_size = 4; + _pm_cputime_size = 4; + _pm_idletime_size = sizeof(unsigned long); + } + else if (major == 2 && minor >= 0 && minor <= 4) { /* 2.6.0->.4 */ + fprintf(stderr, "NOTICE: using kernel 2.6.0 to 2.6.4 CPU types\n"); + _pm_cputime_size = 4; + _pm_idletime_size = 4; + } + else + fprintf(stderr, "NOTICE: using 64 bit CPU time types\n"); + } + for (i = 0; i < sizeof(metrictab)/sizeof(metrictab[0]); i++) { + idp = (__pmID_int *)&(metrictab[i].m_desc.pmid); + if (idp->cluster == CLUSTER_STAT) { + switch (idp->item) { + case 0: /* kernel.percpu.cpu.user */ + case 1: /* kernel.percpu.cpu.nice */ + case 2: /* kernel.percpu.cpu.sys */ + case 20: /* kernel.all.cpu.user */ + case 21: /* kernel.all.cpu.nice */ + case 22: /* kernel.all.cpu.sys */ + case 30: /* kernel.percpu.cpu.wait.total */ + case 31: /* kernel.percpu.cpu.intr */ + case 34: /* kernel.all.cpu.intr */ + case 35: /* kernel.all.cpu.wait.total */ + _pm_metric_type(metrictab[i].m_desc.type, _pm_cputime_size); + break; + case 3: /* kernel.percpu.cpu.idle */ + case 23: /* kernel.all.cpu.idle */ + _pm_metric_type(metrictab[i].m_desc.type, _pm_idletime_size); + break; + case 12: /* kernel.all.intr */ + _pm_metric_type(metrictab[i].m_desc.type, _pm_intr_size); + break; + case 13: /* kernel.all.pswitch */ + _pm_metric_type(metrictab[i].m_desc.type, _pm_ctxt_size); + break; + } + } + if (metrictab[i].m_desc.type == PM_TYPE_NOSUPPORT) + fprintf(stderr, "Bad kernel metric descriptor type 
(%u.%u)\n", + idp->cluster, idp->item); + } + /* * Read System.map and /proc/ksyms. Used to translate wait channel * addresses to symbol names. Index: devel-pcp-2.5.99/src/pmdas/linux/proc_stat.c =================================================================== --- devel-pcp-2.5.99.orig/src/pmdas/linux/proc_stat.c 2006-11-21 14:37:38.612722500 +1100 +++ devel-pcp-2.5.99/src/pmdas/linux/proc_stat.c 2006-11-21 17:11:19.965021000 +1100 @@ -64,9 +64,6 @@ refresh_proc_stat(proc_cpuinfo_t *proc_c int n; int i; int j; - unsigned long cur_idle; - struct utsname kversion; - static int have_long_idle_metric = -1; if ((fd = open("/proc/stat", O_RDONLY)) < 0) { return -errno; @@ -90,17 +87,6 @@ refresh_proc_stat(proc_cpuinfo_t *proc_c bufindex = (char **)malloc(maxbufindex * sizeof(char *)); } - if (have_long_idle_metric < 0) { - memset(&kversion, 0, sizeof(kversion)); - if (uname(&kversion) == 0) { - have_long_idle_metric = - strncmp(kversion.release, "2.2", 3) == 0 || - strncmp(kversion.release, "2.4", 3) == 0; - } - else - fprintf(stderr, "refresh_proc_stat: warning: uname() failed: %s\n", strerror(errno)); - } - nbufindex = 0; bufindex[nbufindex++] = statbuf; for (i=0; i < n; i++) { @@ -142,68 +128,40 @@ refresh_proc_stat(proc_cpuinfo_t *proc_c proc_stat->cpu_indom->it_set[i].i_name = cpu_name(proc_cpuinfo, i); } - /* - * All cpu metrics are "unsigned int" except idle, - * which is "unsigned long" on 2.4.x kernels and - * "unsigned int" on 2.6.x kernels. 
- */ - n = proc_stat->ncpu * sizeof(unsigned int); - proc_stat->p_user = (unsigned int *)malloc(n); - proc_stat->p_nice = (unsigned int *)malloc(n); - proc_stat->p_sys = (unsigned int *)malloc(n); - proc_stat->p_wait = (unsigned int *)malloc(n); - proc_stat->p_irq = (unsigned int *)malloc(n); - proc_stat->p_sirq = (unsigned int *)malloc(n); + n = proc_stat->ncpu * sizeof(unsigned long long); + proc_stat->p_user = (unsigned long long *)malloc(n); + proc_stat->p_nice = (unsigned long long *)malloc(n); + proc_stat->p_sys = (unsigned long long *)malloc(n); + proc_stat->p_idle = (unsigned long long *)malloc(n); + proc_stat->p_wait = (unsigned long long *)malloc(n); + proc_stat->p_irq = (unsigned long long *)malloc(n); + proc_stat->p_sirq = (unsigned long long *)malloc(n); memset(proc_stat->p_user, 0, n); memset(proc_stat->p_nice, 0, n); memset(proc_stat->p_sys, 0, n); + memset(proc_stat->p_idle, 0, n); memset(proc_stat->p_wait, 0, n); memset(proc_stat->p_irq, 0, n); memset(proc_stat->p_sirq, 0, n); - - n = proc_stat->ncpu * sizeof(unsigned long); - proc_stat->p_idle = (unsigned long *)malloc(n); - proc_stat->p_prev_idle = (unsigned long *)malloc(n); - memset(proc_stat->p_idle, 0, n); - memset(proc_stat->p_prev_idle, 0, n); } - /* * cpu 95379 4 20053 6502503 * 2.6 kernels have 3 additional fields * for wait, irq and soft_irq. */ - strcpy(fmt, "cpu %u %u %u %lu %u %u %u"); + strcpy(fmt, "cpu %llu %llu %llu %llu %llu %llu %llu"); n = sscanf((const char *)bufindex[0], fmt, &proc_stat->user, &proc_stat->nice, - &proc_stat->sys, &cur_idle, + &proc_stat->sys, &proc_stat->idle, &proc_stat->wait, &proc_stat->irq, &proc_stat->sirq); if (n == 4) proc_stat->wait = proc_stat->irq = proc_stat->sirq = 0; - if (cur_idle >= proc_stat->prev_idle) - proc_stat->idle += cur_idle - proc_stat->prev_idle; - else { - /* - * For 2.6.x kernels, idle counters always wrap at 32 bits. - * For 2.4 kernels, idle counters wrap at either 32 bits - * or 64 bits, depending on sizeof(long). 
- */ - if (have_long_idle_metric) - proc_stat->idle += cur_idle + - (ULONG_MAX - proc_stat->prev_idle); - else - proc_stat->idle += cur_idle + - (UINT_MAX - proc_stat->prev_idle); - } - proc_stat->prev_idle = cur_idle; - /* * per-cpu stats * e.g. cpu0 95379 4 20053 6502503 - * * 2.6 kernels have 3 additional fields * for wait, irq and soft_irq. */ @@ -223,36 +181,26 @@ refresh_proc_stat(proc_cpuinfo_t *proc_c proc_stat->p_sirq[0] = proc_stat->sirq; } else { + strcpy(fmt, "cpu%d %llu %llu %llu %llu %llu %llu %llu"); for (i=0; i < proc_stat->ncpu; i++) { for (j=0; j < nbufindex; j++) { if (strncmp("cpu", bufindex[j], 3) == 0 && isdigit(bufindex[j][3])) { int c; int cpunum = atoi(&bufindex[j][3]); if (cpunum >= 0 && cpunum < proc_stat->ncpu) { - n = sscanf(bufindex[j], "cpu%d %u %u %u %lu %u %u %u", &c, - &proc_stat->p_user[cpunum], &proc_stat->p_nice[cpunum], - &proc_stat->p_sys[cpunum], &cur_idle, - &proc_stat->p_wait[cpunum], &proc_stat->p_irq[cpunum], + n = sscanf(bufindex[j], fmt, &c, + &proc_stat->p_user[cpunum], + &proc_stat->p_nice[cpunum], + &proc_stat->p_sys[cpunum], + &proc_stat->p_idle[cpunum], + &proc_stat->p_wait[cpunum], + &proc_stat->p_irq[cpunum], &proc_stat->p_sirq[cpunum]); - if (n == 4) { proc_stat->p_wait[cpunum] = proc_stat->p_irq[cpunum] = proc_stat->p_sirq[cpunum] = 0; } - - if (cur_idle >= proc_stat->p_prev_idle[cpunum]) - proc_stat->p_idle[cpunum] += cur_idle - proc_stat->p_prev_idle[cpunum]; - else { - /* wrapped, see comment above */ - if (have_long_idle_metric) - proc_stat->p_idle[cpunum] += cur_idle + - (ULONG_MAX - proc_stat->p_prev_idle[cpunum]); - else - proc_stat->p_idle[cpunum] += cur_idle + - (UINT_MAX - proc_stat->p_prev_idle[cpunum]); - } - proc_stat->p_prev_idle[cpunum] = cur_idle; } } } @@ -292,10 +240,10 @@ refresh_proc_stat(proc_cpuinfo_t *proc_c * intr 32845463 24099228 2049 0 2 .... 
* (just export the first number, which is total interrupts) */ - strcpy(fmt, "intr %u"); + strcpy(fmt, "intr %llu"); for (j=0; j < nbufindex; j++) { if (strncmp(fmt, bufindex[j], 5) == 0) { - sscanf((const char *)bufindex[j], fmt, &proc_stat->intr[0]); + sscanf((const char *)bufindex[j], fmt, &proc_stat->intr); break; } } @@ -303,7 +251,7 @@ refresh_proc_stat(proc_cpuinfo_t *proc_c /* * ctxt 1733480 */ - strcpy(fmt, "ctxt %lu"); + strcpy(fmt, "ctxt %llu"); for (j=0; j < nbufindex; j++) { if (strncmp(fmt, bufindex[j], 5) == 0) { sscanf((const char *)bufindex[j], fmt, &proc_stat->ctxt); Index: devel-pcp-2.5.99/src/pmdas/linux/proc_stat.h =================================================================== --- devel-pcp-2.5.99.orig/src/pmdas/linux/proc_stat.h 2006-11-21 13:58:32.434095500 +1100 +++ devel-pcp-2.5.99/src/pmdas/linux/proc_stat.h 2006-11-21 15:30:23.214497750 +1100 @@ -24,18 +24,14 @@ #ident "$Id: proc_stat.h,v 1.13 2005/05/10 05:37:54 markgw Exp $" typedef struct { - unsigned int user, sys, nice, wait, irq, sirq; - unsigned long idle; - unsigned long prev_idle; + unsigned long long user, sys, nice, idle, wait, irq, sirq; unsigned int ncpu; - unsigned int *p_user, *p_sys, *p_nice, *p_wait, *p_irq, *p_sirq; - unsigned long *p_idle; - unsigned long *p_prev_idle; + unsigned long long *p_user, *p_sys, *p_nice, *p_idle, *p_wait, *p_irq, *p_sirq; unsigned int ndisk; - unsigned int page[2]; - unsigned int swap[2]; - unsigned int intr[65]; - unsigned long ctxt; + unsigned int page[2]; /* unused in 2.6, switched to /proc/vmstat */ + unsigned int swap[2]; /* unused in 2.6, switched to /proc/vmstat */ + unsigned long long intr; + unsigned long long ctxt; unsigned long btime; unsigned long processes; pmdaIndom *cpu_indom; --=-IcbJvWTwD/RGu2WiCA1n-- From nscott@aconex.com Tue Nov 21 13:15:49 2006 Received: with ECARTIS (v1.0.0; list pcp); Tue, 21 Nov 2006 13:15:58 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com 
(8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kALLFlaG026567 for ; Tue, 21 Nov 2006 13:15:49 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 965BF5341C0; Wed, 22 Nov 2006 08:14:59 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 15964-01-49; Wed, 22 Nov 2006 08:14:57 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 3CC3F5341B6; Wed, 22 Nov 2006 08:14:57 +1100 (EST) Subject: Re: [PATCH] Fix Linux PMDA CPU time metrics From: Nathan Scott Reply-To: nscott@aconex.com To: Keith Owens Cc: Michael Newton , Mark Goodwin , pcp@oss.sgi.com In-Reply-To: <12630.1164093933@kao2.melbourne.sgi.com> References: <12630.1164093933@kao2.melbourne.sgi.com> Content-Type: text/plain Organization: Aconex Date: Wed, 22 Nov 2006 08:15:47 +1100 Message-Id: <1164143748.4695.302.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 591 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp On Tue, 2006-11-21 at 18:25 +1100, Keith Owens wrote: > Nathan Scott (on Tue, 21 Nov 2006 17:32:19 +1100) wrote: > >Finally, theres a new CPU time being accounted in recent 2.6 kernels > >("steal") - I've not updated the agent to export that as yet (it is > >always zero on my boxen). > > The steal time is set in account_steal_time() which is only called from > ppc and s390 systems. IOW, from those architectures that have built in > hypervisors. I would expect the steal time to start being set under > Xen and similar hypervisors, so it should be added to PCP. 
> > -- Nathan From nscott@aconex.com Tue Nov 21 13:20:54 2006 Received: with ECARTIS (v1.0.0; list pcp); Tue, 21 Nov 2006 13:21:03 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kALLKoaG027828 for ; Tue, 21 Nov 2006 13:20:53 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 873805341C8; Wed, 22 Nov 2006 08:20:03 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 15964-01-68; Wed, 22 Nov 2006 08:20:02 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 8B8265341C6; Wed, 22 Nov 2006 08:20:02 +1100 (EST) Subject: Re: [PATCH] Fix Linux PMDA CPU time metrics From: Nathan Scott Reply-To: nscott@aconex.com To: Keith Owens Cc: Michael Newton , Mark Goodwin , pcp@oss.sgi.com In-Reply-To: <12630.1164093933@kao2.melbourne.sgi.com> References: <12630.1164093933@kao2.melbourne.sgi.com> Content-Type: text/plain Organization: Aconex Date: Wed, 22 Nov 2006 08:20:53 +1100 Message-Id: <1164144053.4695.307.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 592 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp On Tue, 2006-11-21 at 18:25 +1100, Keith Owens wrote: > Nathan Scott (on Tue, 21 Nov 2006 17:32:19 +1100) wrote: > >Finally, theres a new CPU time being accounted in recent 2.6 kernels > >("steal") - I've not updated the agent to export that as yet (it is > >always zero on my boxen). > > The steal time is set in account_steal_time() which is only called from > ppc and s390 systems. IOW, from those architectures that have built in > hypervisors. 
I would expect the steal time to start being set under > Xen and similar hypervisors, so it should be added to PCP. Ah, I see - thanks Keith. I'll cook up a patch when I get a chance. cheers. -- Nathan From nscott@aconex.com Thu Nov 23 14:32:09 2006 Received: with ECARTIS (v1.0.0; list pcp); Thu, 23 Nov 2006 14:32:16 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kANMW7aG025304 for ; Thu, 23 Nov 2006 14:32:09 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 45B3C534261; Fri, 24 Nov 2006 09:31:15 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 21845-01-78; Fri, 24 Nov 2006 09:31:14 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 27E48534251; Fri, 24 Nov 2006 09:31:14 +1100 (EST) Subject: Re: [PATCH 01/12] Fix Windows PMDA build From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton Cc: Mark Goodwin , pcp@oss.sgi.com In-Reply-To: <1163999314.4695.231.camel@edge> References: <1163999314.4695.231.camel@edge> Content-Type: text/plain Organization: Aconex Date: Fri, 24 Nov 2006 09:32:11 +1100 Message-Id: <1164321131.4695.368.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 593 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp On Mon, 2006-11-20 at 16:08 +1100, Nathan Scott wrote: > Currently the Makefile for the Windows agent is in a half-broken state, > it assumes in some places that uuencoding has been done on certain files > where it has not (uuencode/uudecode doesn't seem to be part of a default > Cygwin install either). 
> > This reverts the Windows PMDA Makefile to its earlier, working state. Don't worry about this patch for now, I'll send in a patch early next week to fix this up properly. cheers. -- Nathan From nscott@aconex.com Sun Nov 26 22:02:58 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 26 Nov 2006 22:03:05 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR62taG029316 for ; Sun, 26 Nov 2006 22:02:57 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 28D205341DD; Mon, 27 Nov 2006 17:01:28 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 32357-01-93; Mon, 27 Nov 2006 17:01:27 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 4B0CF53410F; Mon, 27 Nov 2006 17:01:27 +1100 (EST) Subject: [PATCH] fix lmsensors Install script botch From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-baw+HYMSpvrMJJA2eNNv" Organization: Aconex Date: Mon, 27 Nov 2006 17:03:36 +1100 Message-Id: <1164607416.4695.420.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 595 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-baw+HYMSpvrMJJA2eNNv Content-Type: text/plain Content-Transfer-Encoding: 7bit Looks like this script was based on that of the simple PMDA, but the socket number was not updated... 
with the attached patch, we now have unique socket numbers again: $ grep socket_inet_def src/pmdas/*/Install src/pmdas/lmsensors/Install:socket_inet_def=2079 src/pmdas/sample/Install:socket_inet_def=2077 # default TCP port for Internet socket IPC src/pmdas/simple/Install:socket_inet_def=2078 src/pmdas/weblog/Install:socket_inet_def=2080 # default TCP port for Internet socket IPC cheers. -- Nathan --=-baw+HYMSpvrMJJA2eNNv Content-Disposition: attachment; filename=fix-lmsensors-socket_inet_def Content-Type: text/x-patch; name=fix-lmsensors-socket_inet_def; charset=UTF-8 Content-Transfer-Encoding: 7bit Index: devel-pcp-2.5.99/src/pmdas/lmsensors/Install =================================================================== --- devel-pcp-2.5.99.orig/src/pmdas/lmsensors/Install 2006-11-22 11:34:41.042329750 +1100 +++ devel-pcp-2.5.99/src/pmdas/lmsensors/Install 2006-11-22 11:34:45.498608250 +1100 @@ -44,7 +44,7 @@ pmdaSetup dso_opt=true socket_opt=true -socket_inet_def=2078 +socket_inet_def=2079 pmdaInstall --=-baw+HYMSpvrMJJA2eNNv-- From nscott@aconex.com Sun Nov 26 22:02:50 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 26 Nov 2006 22:02:57 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR62laG029305 for ; Sun, 26 Nov 2006 22:02:49 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 368E1534172; Mon, 27 Nov 2006 17:01:22 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 01670-01-64; Mon, 27 Nov 2006 17:01:21 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 4891753410F; Mon, 27 Nov 2006 17:01:21 +1100 (EST) Subject: [PATCH] fix dbpmda agent execvp use From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: 
pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-8Aq5rv/42H+l2h4ZaTvP" Organization: Aconex Date: Mon, 27 Nov 2006 17:03:30 +1100 Message-Id: <1164607410.4695.419.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 594 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-8Aq5rv/42H+l2h4ZaTvP Content-Type: text/plain Content-Transfer-Encoding: 7bit Hi guys, When using dbpmda with a new agent that's being invoked with no arguments, I'm hitting a problem where the execvp(2) syscall is not being made with a NULL-terminated argv. This causes spurious bogus arguments to appear on the agent's command line, and messes up its argv/argc handling. This is fixed by the attached patch. It looks like a coding oversight, as this extra addarglist(NULL) _is_ there for the non-short-circuit case further down in doargs(). cheers. -- Nathan --=-8Aq5rv/42H+l2h4ZaTvP Content-Disposition: attachment; filename=fix-dbpmda-args Content-Type: text/x-patch; name=fix-dbpmda-args; charset=UTF-8 Content-Transfer-Encoding: 7bit Index: devel-pcp-2.5.99/src/dbpmda/src/lex.l =================================================================== --- devel-pcp-2.5.99.orig/src/dbpmda/src/lex.l 2006-11-27 16:19:02.998948000 +1100 +++ devel-pcp-2.5.99/src/dbpmda/src/lex.l 2006-11-27 16:19:46.533668750 +1100 @@ -429,8 +429,10 @@ doargs(void) initarglist(); - if (lastc == '\n') + if (lastc == '\n') { + addarglist(NULL); return; + } p = buf; for ( ; ; ) { --=-8Aq5rv/42H+l2h4ZaTvP-- From nscott@aconex.com Sun Nov 26 22:09:42 2006 Received: with ECARTIS (v1.0.0; list pcp); Sun, 26 Nov 2006 22:09:49 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR69eaG029986 for ; Sun, 26 Nov 2006 22:09:41 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by
page.mel.office.aconex.com (Postfix) with ESMTP id 334E953410F; Mon, 27 Nov 2006 17:08:16 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 04893-01-27; Mon, 27 Nov 2006 17:08:14 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 8B56B5341DD; Mon, 27 Nov 2006 17:08:14 +1100 (EST) Subject: [PATCH] fix new gcc warnings from sginap From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-OdbbX0i28kMzV3NWRvg4" Organization: Aconex Date: Mon, 27 Nov 2006 17:10:23 +1100 Message-Id: <1164607824.4695.426.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 596 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-OdbbX0i28kMzV3NWRvg4 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Several of the tools (e.g. pmie, pmval, and others) are producing gcc warnings like this (I'm using a gcc 4.1.2 pre-release here)... pmval.c: In function ‘sleeptill’: pmval.c:233: warning: value computed is not used pmval.c:233: warning: value computed is not used Here's what looks like the simplest way to fix this, since the return value that we're computing-by-casting-but-not-using really isn't used anywhere in the entire source tree. cheers. 
-- Nathan --=-OdbbX0i28kMzV3NWRvg4 Content-Disposition: attachment; filename=fix-sginap-warnings Content-Type: text/x-patch; name=fix-sginap-warnings; charset=UTF-8 Content-Transfer-Encoding: 8bit === pmval === gcc -fpic -fno-strict-aliasing -Wall -g -DPCP_DEBUG -DPCP_VERSION=\"2.5.99\" -D_GNU_SOURCE -I../../src/include -c -o pmval.o pmval.c pmval.c: In function ‘sleeptill’: pmval.c:233: warning: value computed is not used pmval.c:233: warning: value computed is not used pmval.c: In function ‘main’: pmval.c:1631: warning: value computed is not used pmval.c:1631: warning: value computed is not used Index: devel-pcp-2.5.99/src/include/platform_defs.h.in =================================================================== --- devel-pcp-2.5.99.orig/src/include/platform_defs.h.in 2006-11-20 17:20:11.794595750 +1100 +++ devel-pcp-2.5.99/src/include/platform_defs.h.in 2006-11-20 17:20:22.703277500 +1100 @@ -341,7 +341,7 @@ bozo! need to find where MAXHOSTNAMELEN #ifndef HAVE_SGINAP #ifdef HAVE_USLEEP -#define sginap(x) ((long)usleep((long)((double)1000000.0 * x / CLK_TCK)),0) +#define sginap(x) (usleep((long)((double)1000000.0 * x / CLK_TCK)),0) #else #define sginap(x) sleep(x/CLK_TCK) #endif --=-OdbbX0i28kMzV3NWRvg4-- From nscott@aconex.com Wed Nov 29 17:09:12 2006 Received: with ECARTIS (v1.0.0; list pcp); Wed, 29 Nov 2006 17:09:19 -0800 (PST) Received: from page.mel.office.aconex.com (eth2333.vic.adsl.internode.on.net [150.101.159.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU197aG014183 for ; Wed, 29 Nov 2006 17:09:09 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 2E87853423F; Thu, 30 Nov 2006 12:08:16 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 04854-01-40; Thu, 30 Nov 2006 12:08:05 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com 
(Postfix) with ESMTP id 91DDF534278; Thu, 30 Nov 2006 12:08:05 +1100 (EST) Subject: [PATCH] pmie support for stomp messages From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-54hwPMB8NXFNYD/KXnI2" Organization: Aconex Date: Thu, 30 Nov 2006 12:06:56 +1100 Message-Id: <1164848816.4992.65.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 597 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-54hwPMB8NXFNYD/KXnI2 Content-Type: text/plain Content-Transfer-Encoding: 7bit Hi all, The following patch adds pmie support for the "Streaming Text Orientated Messaging Protocol" - see (http://stomp.codehaus.org/) - which allows us to generate performance events from pmie, pass them to a JMS server that talks STOMP, and then share those events with (many) interested parties. The implementation here extends the pmie language by adding a "stomp" action for rules (in addition to the current syslog, print, alarm, etc). If any "stomp" actions are presented in the pmie configuration file, we connect to the JMS server from /var/lib/pcp/config/pmie/stomp on startup (or fail). Truthful stomp rule evaluation results in the user-defined message being sent to the server at that time. There is logic also to attempt reconnection to the JMS server should it be unavailable (this uses an "opportunistic" approach - only when a stomp rule evaluates to true, and at most one reconnect attempt per minute). 
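The "opportunistic" reconnect policy described above can be sketched roughly as follows. This is only an illustration of the throttling idea, not the code from the attached patch: the names reconnect_state, reconnect_allowed and RECONNECT_MIN_INTERVAL are hypothetical, and the sketch assumes the attempt time is recorded only when an attempt is actually permitted.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical names throughout; none of these appear in the patch. */
#define RECONNECT_MIN_INTERVAL 60   /* seconds between reconnect attempts */

struct reconnect_state {
    bool   connected;      /* is the link to the message server up? */
    time_t last_attempt;   /* time of the most recent attempt, 0 if none */
};

/*
 * Called when a stomp rule has evaluated to true at time "now".
 * Returns true if a reconnect attempt may be made, and records the
 * attempt time only in that case, so refused calls do not push the
 * next permitted attempt further into the future.
 */
static bool
reconnect_allowed(struct reconnect_state *rs, time_t now)
{
    if (rs->connected)
        return false;                /* link is up, nothing to do */
    if (rs->last_attempt != 0 &&
        now < rs->last_attempt + RECONNECT_MIN_INTERVAL)
        return false;                /* too soon since the last try */
    rs->last_attempt = now;
    return true;
}
```

A caller would check reconnect_allowed() just before sending a message on a dead connection, and set connected once the protocol handshake has completed.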
Here are some example rules, monitoring our production machines at the moment: shell_ping_commands_bad = some_inst ( nomatch_inst "mel-http*" (shping.status > 0) ) -> print "ERROR: %i command unsuccessful\n" & stomp "ERROR: Shell Ping: %i command unsuccessful\n"; http_login_success = some_inst ( shping.status #'mel-http' #'mel-https' > 0 ) -> print "ERROR: %i failed to login, instance is down\n" & stomp "ERROR: Shell Ping: %i failed to login, instance is down\n"; Finally, there's another piece of software to "catch" the events being made available by the JMS server - in our case, we have a Java applet that uses JDIC to integrate into the desktop control panel and provide balloon popups when rules are triggered (as well as a small amount of event history, and severity thresholds). That's written in Java though, so we'll add a SourceForge PCP add-on project for that (it's completely independent, and doesn't really belong in PCP itself). This patch adds no new dependencies to PCP/pmie though; it uses only POSIX APIs already used by pmie and libpcp. cheers. -- Nathan --=-54hwPMB8NXFNYD/KXnI2 Content-Disposition: attachment; filename=stomp-pmie-final Content-Type: text/x-patch; name=stomp-pmie-final; charset=UTF-8 Content-Transfer-Encoding: 7bit Index: final-hacka-pcp-2.5.99/src/pmie/src/act.sk =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/act.sk 2006-11-27 10:00:42.884279500 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/act.sk 2006-11-30 11:19:55.637939250 +1100 @@ -1,5 +1,6 @@ /* * Copyright (c) 1995-2002 Silicon Graphics, Inc. All Rights Reserved. + * Portions copyright (c) 2006 Aconex. All Rights Reserved.
* * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -279,6 +280,30 @@ actPrint(Expr *x) /* + * operator: actStomp + */ +void +actStomp(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + + if ((arg2 == NULL) || + (x->smpls[0].stamp == 0) || + (now >= *(RealTime *)arg2->ring + x->smpls[0].stamp)) + { + EVALARG(arg1) + x->smpls[0].stamp = now; + if (stompSend((const char *)arg1->ring) != 0) + *(Truth *)x->ring = FALSE; + else + *(Truth *)x->ring = TRUE; + perf->actions++; + } +} + + +/* * fake actions for archive mode */ void Index: final-hacka-pcp-2.5.99/src/pmie/src/dstruct.h =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/dstruct.h 2006-11-27 10:00:42.888279750 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/dstruct.h 2006-11-27 10:02:35.063290250 +1100 @@ -298,6 +298,7 @@ typedef int Op; #define ACT_SYSLOG 74 #define ACT_PRINT 75 #define ACT_ARG 76 +#define ACT_STOMP 77 /* no operation (extension) */ #define NOP 80 /* dereferenced variable */ Index: final-hacka-pcp-2.5.99/src/pmie/src/grammar.y =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/grammar.y 2006-11-27 10:00:42.916281500 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/grammar.y 2006-11-30 11:20:08.806762250 +1100 @@ -3,6 +3,7 @@ *********************************************************************** * * Copyright (c) 1995 Silicon Graphics, Inc. All Rights Reserved. + * Portions copyright (c) 2006 Aconex. All Rights Reserved. 
* * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -30,6 +31,7 @@ #include "lexicon.h" #include "pragmatics.h" #include "syslog.h" +#include "stomp.h" #include "show.h" /* strings for error reporting */ @@ -69,6 +71,7 @@ gramerr(char *phrase, char *pos, char *o %token ALARM %token SYSLOG %token PRINT +%token STOMP %token SOME_QUANT %token ALL_QUANT %token PCNT_QUANT @@ -137,7 +140,7 @@ gramerr(char *phrase, char *pos, char *o %left '*' '/' %left UMINUS RATE %left SUM_AGGR AVG_AGGR MAX_AGGR MIN_AGGR COUNT_AGGR -%left SHELL ALARM SYSLOG PRINT +%left SHELL ALARM SYSLOG PRINT STOMP %left ':' '#' '@' %left UNIT_SLASH INTERVAL @@ -222,6 +225,16 @@ act : '(' act ')' { $$ = actExpr(ACT_PRINT, $2, NULL); } | PRINT num actarg /* holdoff format */ { $$ = actExpr(ACT_PRINT, $3, $2); } + | STOMP actarg + { + stomping = 1; + $$ = actExpr(ACT_STOMP, $2, NULL); + } + | STOMP num actarg /* holdoff format */ + { + stomping = 1; + $$ = actExpr(ACT_STOMP, $3, $2); + } /* error reporting */ | error SEQ @@ -250,6 +263,9 @@ act : '(' act ')' | PRINT error { gramerr(tstr_str, follow, opStrings(ACT_PRINT)); $$ = NULL; } + | STOMP error + { gramerr(tstr_str, follow, opStrings(ACT_STOMP)); + $$ = NULL; } ; actarg : arglist Index: final-hacka-pcp-2.5.99/src/pmie/src/lexicon.c =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/lexicon.c 2006-11-27 10:00:42.920281750 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/lexicon.c 2006-11-27 10:02:35.063290250 +1100 @@ -90,6 +90,7 @@ static LexEntry1 optab[] = { {"alarm", ALARM}, {"syslog", SYSLOG}, {"print", PRINT}, + {"stomp", STOMP}, {"rising", RISE}, {"falling", FALL}, {"match_inst", MATCH}, Index: final-hacka-pcp-2.5.99/src/pmie/src/show.c =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/show.c 2006-11-27 10:00:42.928282250 
+1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/show.c 2006-11-27 10:02:35.063290250 +1100 @@ -104,6 +104,7 @@ static struct { { ACT_ALARM, "alarm" }, { ACT_SYSLOG, "syslog" }, { ACT_PRINT, "print" }, + { ACT_STOMP, "stomp" }, { ACT_ARG, "" }, { NOP, "" }, { OP_VAR, "" }, Index: final-hacka-pcp-2.5.99/src/pmie/src/GNUmakefile =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/GNUmakefile 2006-11-27 10:00:42.932282500 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/GNUmakefile 2006-11-27 10:02:35.063290250 +1100 @@ -35,10 +35,10 @@ include $(TOPDIR)/src/include/builddefs TARGET = pmie$(EXECSUFFIX) CFILES = pmie.c symbol.c dstruct.c lexicon.c syntax.c pragmatics.c eval.c \ - show.c match_inst.c syslog.c + show.c match_inst.c syslog.c stomp.c HFILES = fun.h dstruct.h eval.h lexicon.h pmiestats.h pragmatics.h \ - show.h symbol.h syntax.h syslog.h + show.h symbol.h syntax.h syslog.h stomp.h SKELETAL = hdr.sk fetch.sk misc.sk aggregate.sk unary.sk binary.sk \ merge.sk act.sk Index: final-hacka-pcp-2.5.99/src/pmie/src/fun.h =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/fun.h 2006-11-27 10:00:42.940283000 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/fun.h 2006-11-27 10:02:35.063290250 +1100 @@ -128,6 +128,7 @@ void actShell(Expr *); void actAlarm(Expr *); void actSyslog(Expr *); void actPrint(Expr *); +void actStomp(Expr *); void actArg(Expr *); void actFake(Expr *); Index: final-hacka-pcp-2.5.99/src/pmie/src/stomp.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ final-hacka-pcp-2.5.99/src/pmie/src/stomp.h 2006-11-30 11:20:39.096655250 +1100 @@ -0,0 +1,26 @@ +/* + * Copyright (c) 2006 Aconex. All Rights Reserved. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +/* + * Streaming Text Orientated Messaging Protocol implementation + * http://stomp.codehaus.org/ + */ +extern int stomping; /* true if stomp actions present */ +extern char *stompfile; /* stomp config file */ +extern void stompInit(void); /* connect to stomp server */ +extern int stompSend(const char *); /* send to JMS server, via stomp */ Index: final-hacka-pcp-2.5.99/src/pmie/src/pmie.c =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/pmie.c 2006-11-27 10:00:42.944283250 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/pmie.c 2006-11-27 10:43:13.907708500 +1100 @@ -44,6 +44,7 @@ #include #include #include "dstruct.h" +#include "stomp.h" #include "syntax.h" #include "pragmatics.h" #include "eval.h" @@ -73,8 +74,20 @@ static char *intro = "Performance Co-Pi static char logfile[MAXPATHLEN+1]; static char perffile[PMIE_PATHSIZE]; /* /var/tmp/ file name */ +static char menu[] = +"pmie debugger commands\n\n" +" f [file-name] - load expressions from given file or stdin\n" +" l [expr-name] - list named expression or all expressions\n" +" r [interval] - run for given or default interval\n" +" S time-spec - set start time for run\n" +" T time-spec - set default interval for run command\n" +"
v [expr-name] - print subexpression used for %h, %i and\n" +" %v bindings\n" +" h or ? - print this menu of commands\n" +" q - quit\n\n"; + static char usage[] = - "Usage: pmie [options] [filename ...]\n\n" + "Usage: %s [options] [filename ...]\n\n" "Options:\n" " -A align align sample times on natural boundaries\n" " -a archive metrics source is a PCP log archive\n" @@ -85,6 +98,7 @@ static char usage[] = " -e force timestamps to be reported when used with -V, -v or -W\n" " -f run in foreground\n" " -h host metrics source is PMCD on host\n" + " -j stompfile stomp protocol (JMS) file [default %s/config/pmie/stomp]\n" " -l logfile send status and error messages to logfile\n" " -n pmnsfile use an alternative PMNS\n" " -O offset initial offset into the time window\n" @@ -98,17 +112,16 @@ static char usage[] = " -Z timezone set reporting timezone\n" " -z set reporting timezone to local time of metrics source\n"; -static char menu[] = -"pmie debugger commands\n\n" -" f [file-name] - load expressions from given file or stdin\n" -" l [expr-name] - list named expression or all expressions\n" -" r [interval] - run for given or default interval\n" -" S time-spec - set start time for run\n" -" T time-spec - set default interval for run command\n" -" v [expr-name] - print subexpression used for %h, %i and\n" -" %v bindings\n" -" h or ? 
- print this menu of commands\n" -" q - quit\n\n"; +/*********************************************************************** + * usage message + ***********************************************************************/ + +static void +usageMessage(void) +{ + fprintf(stderr, usage, pmProgname, pmGetConfig("PCP_VAR_DIR")); + exit(1); +} /*********************************************************************** @@ -445,7 +458,7 @@ getargs(int argc, char *argv[]) memset(&tv2, 0, sizeof(tv2)); dstructInit(); - while ((c=getopt(argc, argv, "a:A:bc:CdD:efh:l:n:O:S:t:T:vVWXxzZ:?")) != EOF) { + while ((c=getopt(argc, argv, "a:A:bc:CdD:efh:j:l:n:O:S:t:T:vVWXxzZ:?")) != EOF) { switch (c) { case 'a': /* archives */ @@ -536,6 +549,10 @@ getargs(int argc, char *argv[]) dfltHost = optarg; break; + case 'j': /* stomp protocol (JMS) config */ + stompfile = optarg; + break; + case 'l': /* alternate log file */ if (commandlog != NULL) { fprintf(stderr, "%s: at most one -l option is allowed\n", @@ -549,7 +566,7 @@ getargs(int argc, char *argv[]) case 'n': /* alternate namespace file */ pmnsfile = optarg; - break; + break; case 'O': /* position within time window */ offsetFlag = optarg; @@ -626,10 +643,8 @@ getargs(int argc, char *argv[]) pmProgname); err++; } - if (err) { - fprintf(stderr, usage); - exit(1); - } + if (err) + usageMessage(); if (foreground) isdaemon = 0; @@ -760,6 +775,9 @@ getargs(int argc, char *argv[]) setsid(); /* not process group leader, lose controlling tty */ } + if (stomping) + stompInit(); /* connect to our message server */ + if (agent) agentInit(); /* initialize secret agent stuff */ Index: final-hacka-pcp-2.5.99/src/pmie/src/stomp.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ final-hacka-pcp-2.5.99/src/pmie/src/stomp.c 2006-11-30 11:26:37.563058000 +1100 @@ -0,0 +1,358 @@ +/* + * Copyright (c) 2006 Aconex. All Rights Reserved. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include +#include +#include +#include +#include +#include +#include + +static int stomp_connect(const char *hostname, int port); +static void stomp_disconnect(void); + +int stomping; +char *stompfile; +extern int verbose; + +static int fd = -1; +static int port = -1; +static int timeout = 2; /* default 2 sec to timeout JMS server ACKs */ +static char *hostname; +static char *username; +static char *passcode; +static char *topic; /* JMS "topic" for pmie messages */ +static char pmietopic[] = "PMIE"; /* default JMS "topic" for pmie */ + +static char buffer[4096]; + +static int stomp_connect(const char *hostname, int port) +{ + int sts, nodelay = 1; + struct linger nolinger = { 1, 0 }; + struct sockaddr_in myaddr; + struct hostent *servinfo; + + if ((servinfo = gethostbyname(hostname)) == NULL) + return -1; + + /* socket setup */ + if ((fd = socket(PF_INET, SOCK_STREAM, 0)) < 0) + return -2; + if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, /* avoid 200 ms delay */ + (char *)&nodelay, (socklen_t)sizeof(nodelay)) < 0) { + stomp_disconnect(); + return -3; + } + if (setsockopt(fd, SOL_SOCKET, SO_LINGER, /* don't linger on close */ + (char *)&nolinger, (socklen_t)sizeof(nolinger)) < 0) { + stomp_disconnect(); + return -4; + } + + memset(&myaddr, 0, 
sizeof(myaddr)); + myaddr.sin_family = AF_INET; + memcpy(&myaddr.sin_addr, servinfo->h_addr, servinfo->h_length); + myaddr.sin_port = htons(port); + if ((sts = connect(fd, (struct sockaddr *)&myaddr, sizeof(myaddr))) < 0) { + stomp_disconnect(); + return -5; + } + + return fd; +} + +static int stomp_read_ack(void) +{ + struct timeval tv; + fd_set fds, readyfds; + int nready, sts; + + FD_ZERO(&fds); + FD_SET(fd, &fds); + tv.tv_sec = timeout; + tv.tv_usec = 0; + memcpy(&readyfds, &fds, sizeof(readyfds)); + nready = select(fd + 1, &readyfds, NULL, NULL, &tv); + if (nready <= 0) { + if (nready == 0) /* timeout: select does not set errno here */ + __pmNotifyErr(LOG_ERR, "Timed out waiting for server %s:%d", + hostname, port); + else + __pmNotifyErr(LOG_ERR, "Error waiting for server %s:%d - %s", + hostname, port, strerror(errno)); + stomp_disconnect(); + return -1; + } + + do { + sts = read(fd, buffer, sizeof(buffer)); + if (sts < 0) { + __pmNotifyErr(LOG_ERR, "Error reading from server %s:%d - %s", + hostname, port, strerror(errno)); + stomp_disconnect(); + return -1; + } + /* check for anything else we need to read to clear this ACK */ + memset(&tv, 0, sizeof(tv)); + memcpy(&readyfds, &fds, sizeof(readyfds)); + } while (select(fd + 1, &readyfds, NULL, NULL, &tv) > 0); + + return 0; +} + +static int stomp_write(const char *buffer, int length) +{ + int sts; + + do { + sts = write(fd, buffer, length); + if (sts < 0) { + __pmNotifyErr(LOG_ERR, "Write error to JMS server %s:%d - %s", + hostname, port, strerror(errno)); + stomp_disconnect(); + return -1; + } + else if (sts == 0) + break; + buffer += sts; length -= sts; /* skip bytes already written */ + } while (length > 0); + + return 0; +} + +static int stomp_authenticate(void) +{ + int len; + + if (fd < 0) + return -1; + len = snprintf(buffer, sizeof(buffer), + "CONNECT\nlogin:%s\npasscode:%s\n\n", username, passcode); + if (stomp_write(buffer, len) < 0) + return -1; + if (stomp_write("\0\n", 2) < 0) + return -1; + return 0; +} + +static int stomp_destination(void) +{ + int len; + + if (fd <
0) + return -1; + len = snprintf(buffer, sizeof(buffer), + "SUB\ndestination:/topic/%s\n\n", topic); + if (stomp_write(buffer, len) < 0) + return -1; + if (stomp_write("\0\n", 2) < 0) + return -1; + return 0; +} + +static int stomp_hello(void) +{ + int len; + + if (fd < 0) + return -1; + len = snprintf(buffer, sizeof(buffer), "SEND\ndestination:/topic/%s\n\n" + "INFO: PMIE: Established initial connection", topic); + if (stomp_write(buffer, len) < 0) + return -1; + if (stomp_write("\0\n", 2) < 0) + return -1; + return 0; +} + +static void stomp_disconnect(void) +{ + if (fd >= 0) + close(fd); + fd = -1; +} + +static char *isspace_terminate(char *string) +{ + int i = 0; + + while (string[i] != '\0' && !isspace((unsigned char)string[i])) + i++; /* stop at first whitespace or end of string */ + string[i] = '\0'; + return string; +} + +/* + * Parse our stomp configuration file, simple format: + * host= # JMS server machine + * port= # server port number + * username= | user= + * passcode= | password= + * timeout= # optional + * topic= # optional + */ +static void stomp_parse(void) +{ + char config[MAXPATHLEN+1]; + FILE *f; + + if (stompfile) + snprintf(config, sizeof(config), "%s", stompfile); + else + snprintf(config, sizeof(config), "%s/config/pmie/stomp", + pmGetConfig("PCP_VAR_DIR")); + if ((f = fopen(config, "r")) == NULL) { + __pmNotifyErr(LOG_ERR, "Cannot open STOMP configuration file %s: %s", + config, strerror(errno)); + exit(1); + } + while (fgets(buffer, sizeof(buffer), f)) { + if (strncmp(buffer, "port=", 5) == 0) + port = atoi(isspace_terminate(&buffer[5])); + else if (strncmp(buffer, "host=", 5) == 0) + hostname = strdup(isspace_terminate(&buffer[5])); + else if (strncmp(buffer, "hostname=", 9) == 0) + hostname = strdup(isspace_terminate(&buffer[9])); + else if (strncmp(buffer, "user=", 5) == 0) + username = strdup(isspace_terminate(&buffer[5])); + else if (strncmp(buffer, "username=", 9) == 0) + username = strdup(isspace_terminate(&buffer[9])); + else if (strncmp(buffer, "password=", 9) == 0) + passcode =
strdup(isspace_terminate(&buffer[9])); + else if (strncmp(buffer, "passcode=", 9) == 0) + passcode = strdup(isspace_terminate(&buffer[9])); + else if (strncmp(buffer, "timeout=", 8) == 0) /* optional */ + timeout = atoi(isspace_terminate(&buffer[8])); + else if (strncmp(buffer, "topic=", 6) == 0) /* optional */ + topic = strdup(isspace_terminate(&buffer[6])); + } + fclose(f); + + if (!hostname) + __pmNotifyErr(LOG_ERR, "No host in STOMP config file %s", config); + if (port == -1) + __pmNotifyErr(LOG_ERR, "No port in STOMP config file %s", config); + if (!username) + __pmNotifyErr(LOG_ERR, "No username in STOMP config file %s", config); + if (!passcode) + __pmNotifyErr(LOG_ERR, "No passcode in STOMP config file %s", config); + if (port == -1 || !hostname || !username || !passcode) + exit(1); +} + +/* + * Setup the connection to the stomp server, and handle initial protocol + * negotiations (sending user/passcode over to the server in particular). + * Stomp protocol is clear text... (we don't need no stinkin' security!). + * Note: this routine is used for both the initial connection and also for + * any subsequent reconnect attempts. 
+ */ +void stompInit(void) +{ + time_t thistime = time(NULL); + static time_t lasttime; + static int firsttime = 1; + + if (firsttime) { /* initial connection attempt */ + stomp_parse(); + if (!topic) + topic = pmietopic; + atexit(stomp_disconnect); + } else { /* reconnect attempt, if not too soon */ + time(&thistime); + if (thistime < lasttime + 60) + goto disconnect; + } + + if (verbose) + __pmNotifyErr(LOG_INFO, "Connecting to %s, port %d", hostname, port); + if (stomp_connect(hostname, port) < 0) { + __pmNotifyErr(LOG_ERR, "Could not connect to the message server"); + goto disconnect; + } + + if (verbose) + __pmNotifyErr(LOG_INFO, "Connected; sending stomp connect message"); + if (stomp_authenticate() < 0) { + __pmNotifyErr(LOG_ERR, "Could not send STOMP CONNECT frame to server"); + goto disconnect; + } + + if (verbose) + __pmNotifyErr(LOG_INFO, "Sent; waiting for server ACK"); + if (stomp_read_ack() < 0) { + __pmNotifyErr(LOG_ERR, "Could not read STOMP ACK frame."); + goto disconnect; + } + + if (verbose) + __pmNotifyErr(LOG_INFO, "ACK; sending initial PMIE topic and hello"); + if (stomp_destination() < 0) { + __pmNotifyErr(LOG_ERR, "Could not send TOPIC frame."); + goto disconnect; + } + if (stomp_hello() < 0) { + __pmNotifyErr(LOG_ERR, "Could not send HELLO frame."); + goto disconnect; + } + + if (verbose) + __pmNotifyErr(LOG_INFO, "Sent; waiting for server ACK"); + if (stomp_read_ack() < 0) { + __pmNotifyErr(LOG_ERR, "Could not read STOMP ACK frame"); + goto disconnect; + } + + if (!firsttime) + __pmNotifyErr(LOG_INFO, "Reconnected to STOMP protocol server"); + else if (verbose) + __pmNotifyErr(LOG_INFO, "Initial STOMP protocol setup complete"); + firsttime = 0; + goto finished; + +disconnect: + stomp_disconnect(); + if (firsttime) + exit(1); + /* otherwise, we attempt reconnect on next message firing (>1min) */ +finished: + lasttime = thistime; +} + +/* + * Send a message to the stomp server. 
+ */ +int stompSend(const char *msg) +{ + int len; + + if (fd < 0) stompInit(); /* reconnect */ + if (fd < 0) return -1; + + len = snprintf(buffer, sizeof(buffer), + "SEND\ndestination:/topic/%s\n\n", topic); + if (stomp_write(buffer, len) < 0) + return -1; + if (stomp_write(msg, strlen(msg)) < 0) + return -1; + if (stomp_write("\0\n", 2) < 0) + return -1; + return 0; +} Index: final-hacka-pcp-2.5.99/src/pmie/src/hdr.sk =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/src/hdr.sk 2006-11-27 10:00:42.952283750 +1100 +++ final-hacka-pcp-2.5.99/src/pmie/src/hdr.sk 2006-11-30 11:20:13.303043250 +1100 @@ -41,5 +41,6 @@ #include "pragmatics.h" #include "fun.h" #include "show.h" +#include "stomp.h" Index: final-hacka-pcp-2.5.99/man/man1/pmie.1 =================================================================== --- final-hacka-pcp-2.5.99.orig/man/man1/pmie.1 2006-11-27 10:00:42.968284750 +1100 +++ final-hacka-pcp-2.5.99/man/man1/pmie.1 2006-11-30 11:19:00.098468250 +1100 @@ -2,6 +2,7 @@ '\"macro stdmacro .\" .\" Copyright (c) 2000 Silicon Graphics, Inc. All Rights Reserved. +.\" Portions copyright (c) 2006 Aconex. All Rights Reserved. .\" .\" This program is free software; you can redistribute it and/or modify it .\" under the terms of the GNU General Public License as published by the @@ -42,6 +43,7 @@ [\f3\-c\f1 \f2filename\f1] [\f3\-h\f1 \f2host\f1] [\f3\-l\f1 \f2logfile\f1] +[\f3\-j\f1 \f2stompfile\f1] [\f3\-n\f1 \f2pmnsfile\f1] [\f3\-O\f1 \f2offset\f1] [\f3\-S\f1 \f2starttime\f1] @@ -166,6 +168,15 @@ being evaluated. Standard error is sent to .IR logfile . .TP +.B \-j +An alternative STOMP protocol configuration is loaded from +.IR stompfile . +If this option is not used, and the +.I stomp +action is used in any rule, the default location +.I $PCP_VAR_DIR/config/pmie/stomp +will be used. +.TP .B \-n An alternative Performance Metrics Name Space (PMNS) is loaded from the file .IR pmnsfile . 
@@ -732,10 +743,11 @@ c | c lf(CB) | l. Operators Explanation _ +alarm Raise a visible alarm with \fBxconfirm\f1(1) print Display on standard output shell Execute with \fBsh\fR(1) -alarm Raise a visible alarm with \fBxconfirm\f1(1) -syslog Append to \fI/var/adm/SYSLOG\fR +stomp Send a STOMP message to a JMS server +syslog Append a message to the system log file .TE .P Multiple actions may be separated by the \f(CW&\fR and \f(CW|\fR @@ -895,6 +907,83 @@ or shutdown, or when they have been dete Refer to .BR pmie_check (1) for details on automating this process. +.SH EVENT MONITORING +It is common for production systems to be monitored in a central +location. +Traditionally on UNIX systems this has been performed by the system +log facilities \- see +.BR logger (1), +and +.BR syslogd (1). +.P +.B pmie +fits into this model when rules use the +.I syslog +action. +Note that if the action string begins with \-p (priority) and/or \-t (tag) +then these are extracted from the string and treated in the same way as in +.BR logger (1). +.P +However, it is common to have other event monitoring frameworks also, +into which you may wish to incorporate performance events from +.BR pmie . +You can often use the +.I shell +action to send events to these frameworks, as they usually provide +a program for injecting events into the framework from external +sources. +.P +A final option is use of the +.I stomp +(Streaming Text Oriented Messaging Protocol) action, which allows +.B pmie +to connect to a central JMS (Java Message Service) server and send +events to the PMIE topic. +Tools can be written to extract these text messages and present them +to operations people (via desktop popup windows, etc). +Use of the +.I stomp +action requires a stomp configuration file to be set up, which specifies +the location of the JMS server host, port number, and username/password. 
+.P +The format of this file is as follows: +.P +.ft CW +.nf +.in +0.5i +host=messages.sgi.com # this is the JMS server (required) +port=61616 # and it's listening here (required) +timeout=2 # seconds to wait for server (optional) +username=joe # (required) +password=j03ST0MP # (required) +topic=PMIE # JMS topic for pmie messages (optional) +.in +.fi +.ft 1 +.P +The timeout value specifies the time (in seconds) that +.B pmie +should wait for acknowledgements from the JMS server after +sending a message (as required by the STOMP protocol). +Note that on startup, +.B pmie +will wait indefinitely for a connection, and will not +begin rule evaluation until that initial connection has +been established. +Should the connection to the JMS server be lost at any +time while +.B pmie +is running, +.B pmie +will attempt to reconnect on each subsequent truthful +evaluation of a rule with a +.I stomp +action, but not more than once per minute. +This is to avoid contributing to network congestion. +In this situation, where the STOMP connection to the JMS server +has been severed, the +.I stomp +action will return a non-zero error value. 
.SH FILES .PD 0 .TP 10 @@ -906,7 +995,7 @@ default PMNS specification files .TP .BI $PCP_TMP_DIR/pmie .B pmie -maintains files in this directory to identify the running +maintains files in this directory to identify the running .B pmie instances and to export runtime information about each instance \- this data forms the basis of the pmcd.pmie performance metrics Index: final-hacka-pcp-2.5.99/src/pmie/GNUmakefile =================================================================== --- final-hacka-pcp-2.5.99.orig/src/pmie/GNUmakefile 2006-07-17 12:53:25.000000000 +1000 +++ final-hacka-pcp-2.5.99/src/pmie/GNUmakefile 2006-11-27 11:10:49.783194250 +1100 @@ -46,6 +46,7 @@ install:: default $(INSTALL) -m 644 crontab $(CFG_DIR)/crontab $(INSTALL) -m 644 config.default.install $(CFG_DIR)/config.default $(INSTALL) -m 644 control.install $(CFG_DIR)/control + $(INSTALL) -m 644 stomp.install $(CFG_DIR)/stomp $(INSTALL) -m 755 etc_init.d_pmie $(PCP_RC_DIR)/pmie $(INSTALL) -m 755 pmie_check.sh $(PCP_BINADM_DIR)/pmie_check $(INSTALL) -m 755 pmie2col $(PCP_BIN_DIR)/pmie2col Index: final-hacka-pcp-2.5.99/src/pmie/stomp.install =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ final-hacka-pcp-2.5.99/src/pmie/stomp.install 2006-11-30 11:20:59.513931250 +1100 @@ -0,0 +1,11 @@ +# +# Sample STOMP configuration file, parameters affecting connection +# between pmie and a JMS server for the "stomp" rule action. 
+# + +host=foo.bar.com # this is the JMS server (required) +port=61616 # and it's listening here (required) +timeout=2 # seconds to wait for server (optional) +topic=PMIE # JMS topic for pmie messages (optional) +username=joe # required +password=j03ST0MP # required --=-54hwPMB8NXFNYD/KXnI2-- From nscott@aconex.com Wed Nov 29 19:21:20 2006 Received: with ECARTIS (v1.0.0; list pcp); Wed, 29 Nov 2006 19:21:26 -0800 (PST) Received: from page.mel.office.aconex.com (eth2333.vic.adsl.internode.on.net [150.101.159.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU3LIaG030527 for ; Wed, 29 Nov 2006 19:21:19 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 2640A534287; Thu, 30 Nov 2006 14:20:26 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 31646-01-99; Thu, 30 Nov 2006 14:20:24 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 786DE53423F; Thu, 30 Nov 2006 14:20:24 +1100 (EST) Subject: [PATCH] fix buglet shown by gcc warnings From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-ds87WbwF/f9R+les/8g8" Organization: Aconex Date: Thu, 30 Nov 2006 14:19:18 +1100 Message-Id: <1164856758.4992.72.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 598 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-ds87WbwF/f9R+les/8g8 Content-Type: text/plain Content-Transfer-Encoding: 7bit Hi, There's a bug in the libpcp_trace tag name length checks - due to the limited size of the variable being assigned, we won't notice and error out on names which are too long. 
This is being reported as a warning by gcc, as shown in the top of the patch attached. cheers. -- Nathan --=-ds87WbwF/f9R+les/8g8 Content-Disposition: attachment; filename=fix-trace-warnings Content-Type: text/plain; name=fix-trace-warnings; charset=UTF-8 Content-Transfer-Encoding: 8bit trace.c: In function ‘pmtracebegin’: trace.c:154: warning: comparison is always false due to limited range of data type trace.c: In function ‘pmtraceend’: trace.c:244: warning: comparison is always false due to limited range of data type trace.c: In function ‘pmtraceabort’: trace.c:307: warning: comparison is always false due to limited range of data type Index: devel-pcp-2.5.99/src/libpcp_trace/src/trace.c =================================================================== --- devel-pcp-2.5.99.orig/src/libpcp_trace/src/trace.c 2006-11-30 14:03:56.108930250 +1100 +++ devel-pcp-2.5.99/src/libpcp_trace/src/trace.c 2006-11-30 14:07:00.968483250 +1100 @@ -144,15 +144,15 @@ pmtracebegin(const char *tag) static int first = 1; _pmTraceLibdata *hptr; _pmTraceLibdata hash; - int a_sts = 0, b_sts = 0, protocol; + int len, a_sts = 0, b_sts = 0, protocol; if (tag == NULL || *tag == (char)NULL) return PMTRACE_ERR_TAGNAME; + if ((len = strlen(tag)+1) >= MAXTAGNAMELEN) + return PMTRACE_ERR_TAGLENGTH; hash.tag = (char *)tag; - hash.taglength = (unsigned int)strlen(tag)+1; - if (hash.taglength >= MAXTAGNAMELEN) - return PMTRACE_ERR_TAGLENGTH; + hash.taglength = len; hash.id = _pmtraceid(); hash.tracetype = TRACE_TYPE_TRANSACT; @@ -230,19 +230,19 @@ pmtraceend(const char *tag) _pmTraceLibdata hash; _pmTraceLibdata *hptr; struct timeval now; - int protocol, sts = 0; + int len, protocol, sts = 0; if (tag == NULL || *tag == (char)NULL) return PMTRACE_ERR_TAGNAME; + if ((len = strlen(tag)+1) >= MAXTAGNAMELEN) + return PMTRACE_ERR_TAGLENGTH; if (gettimeofday(&now, NULL) < 0) return -TRACE_ERRNO; /* give just enough info for comparison routine */ hash.tag = (char *)tag; - hash.taglength = (unsigned 
int)strlen(tag)+1; - if (hash.taglength >= MAXTAGNAMELEN) - return PMTRACE_ERR_TAGLENGTH; + hash.taglength = len; hash.id = _pmtraceid(); hash.tracetype = TRACE_TYPE_TRANSACT; @@ -297,15 +297,15 @@ pmtraceabort(const char *tag) { _pmTraceLibdata hash; _pmTraceLibdata *hptr; - int sts = 0; + int len, sts = 0; if (tag == NULL || *tag == (char)NULL) return PMTRACE_ERR_TAGNAME; + if ((len = strlen(tag)+1) >= MAXTAGNAMELEN) + return PMTRACE_ERR_TAGLENGTH; hash.tag = (char *)tag; - hash.taglength = (unsigned int)strlen(tag)+1; - if (hash.taglength >= MAXTAGNAMELEN) - return PMTRACE_ERR_TAGLENGTH; + hash.taglength = len; hash.id = _pmtraceid(); hash.tracetype = TRACE_TYPE_TRANSACT; --=-ds87WbwF/f9R+les/8g8-- From nscott@aconex.com Wed Nov 29 19:37:03 2006 Received: with ECARTIS (v1.0.0; list pcp); Wed, 29 Nov 2006 19:37:10 -0800 (PST) Received: from page.mel.office.aconex.com (eth2333.vic.adsl.internode.on.net [150.101.159.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU3b1aG032211 for ; Wed, 29 Nov 2006 19:37:02 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id E1BF4534273; Thu, 30 Nov 2006 14:36:09 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 08934-01-4; Thu, 30 Nov 2006 14:36:09 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 29A1B5341BC; Thu, 30 Nov 2006 14:36:09 +1100 (EST) Subject: [PATCH] fix another compiler warning From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton , Mark Goodwin Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-SGjMq8zC+wn1ByxOupxp" Organization: Aconex Date: Thu, 30 Nov 2006 14:35:03 +1100 Message-Id: <1164857703.4992.76.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 599 X-ecartis-version: Ecartis v1.0.0 Sender: 
pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-SGjMq8zC+wn1ByxOupxp Content-Type: text/plain Content-Transfer-Encoding: 7bit This unused variable warning is benign, but may as well clean it up... cheers. -- Nathan --=-SGjMq8zC+wn1ByxOupxp Content-Disposition: attachment; filename=fix-proc-pid-unused-warning Content-Type: text/plain; name=fix-proc-pid-unused-warning; charset=UTF-8 Content-Transfer-Encoding: 8bit proc_pid.c: In function ‘refresh_proc_pid’: proc_pid.c:102: warning: unused variable ‘n’ Index: devel-pcp-2.5.99/src/pmdas/linux/proc_pid.c =================================================================== --- devel-pcp-2.5.99.orig/src/pmdas/linux/proc_pid.c 2006-11-30 14:28:34.781341500 +1100 +++ devel-pcp-2.5.99/src/pmdas/linux/proc_pid.c 2006-11-30 14:28:40.369690750 +1100 @@ -99,7 +99,7 @@ refresh_pidlist() int refresh_proc_pid(proc_pid_t *proc_pid) { - int i, n; + int i; int fd; char *p; char buf[1024]; --=-SGjMq8zC+wn1ByxOupxp--
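The tag length fix in the libpcp_trace patch earlier in this archive can be illustrated with a minimal sketch. It is not the library's code: the 8-bit width of taglength, the MAXTAGNAMELEN value of 256, the struct name, and both function names are assumptions made for the example, and -1 stands in for PMTRACE_ERR_TAGLENGTH. It shows why checking the length after storing it in a narrow field can never fail, while checking the full-width value first does.

```c
#include <stddef.h>

/* Hypothetical limit; the real MAXTAGNAMELEN is not shown in the patch. */
#define MAXTAGNAMELEN 256

/* Hypothetical stand-in for _pmTraceLibdata, assuming a narrow length field. */
struct libdata {
    unsigned char taglength;    /* can only hold 0..255 */
};

/*
 * Old pattern: store first, then compare the stored (truncated) value.
 * namelen is what strlen(tag) would return.
 */
int check_after_assign(size_t namelen)
{
    struct libdata hash;

    hash.taglength = (unsigned char)(namelen + 1);  /* wraps modulo 256 */
    if (hash.taglength >= MAXTAGNAMELEN)    /* always false: max is 255 */
        return -1;
    return 0;
}

/* Patched pattern: compare the full-width length, then store it. */
int check_before_assign(size_t namelen)
{
    struct libdata hash;
    int len;

    if ((len = (int)(namelen + 1)) >= MAXTAGNAMELEN)
        return -1;
    hash.taglength = (unsigned char)len;    /* now known to fit */
    (void)hash;
    return 0;
}
```

Under these assumptions a 399-character tag is accepted by check_after_assign() and rejected by check_before_assign(). A compiler may flag the comparison in the first function as always false, which is the kind of warning quoted in the patch.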