From owner-pcp@oss.sgi.com Fri Oct 13 00:14:00 2000 Received: by oss.sgi.com id ; Fri, 13 Oct 2000 00:13:51 -0700 Received: from tah14.ctt.cz ([194.108.115.182]:33031 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Fri, 13 Oct 2000 00:13:50 -0700 Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id JAA00572 for pcp@oss.sgi.com; Fri, 13 Oct 2000 09:14:09 +0200 Date: Fri, 13 Oct 2000 09:14:09 +0200 From: lemming@arthur.plbohnice.cz To: pcp@oss.sgi.com Subject: Patch to Linux PMDA Message-ID: <20001013091409.A523@arthur.plbohnice.cz> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="AhhlLboLdkugWU4S" X-Mailer: Mutt 1.0.1i Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing --AhhlLboLdkugWU4S Content-Type: text/plain; charset=us-ascii

Hello!

I have made a patch to the Linux PMDA. It takes data from /proc/net/tcp and creates a network.tcpconn subtree containing these values:

network.tcpconn {
    established 60:16:1
    syn_sent    60:16:2
    syn_recv    60:16:3
    fin_wait1   60:16:4
    fin_wait2   60:16:5
    time_wait   60:16:6
    close       60:16:7
    close_wait  60:16:8
    last_ack    60:16:9
    listen      60:16:10
    closing     60:16:11
}

Each value holds the number of connections in that state. I did this because we need to archive data on how many connections there are to/from our server and which state they are in. If you consider it worthy, please apply it to the standard distribution.
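The per-state tally described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from the attached patch: the helper names tcp_state_from_line and tally_tcp_states are hypothetical, the column order (sl, local_address, rem_address, st) is assumed from the usual /proc/net/tcp layout, and sscanf is used here purely for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative state numbering, taken from the item numbers in the
 * network.tcpconn list (established = 1 ... closing = 11). */
enum { TCP_STATE_MAX = 11 };

/*
 * Hypothetical helper: return the state number found in one line of
 * /proc/net/tcp, or -1 if the line has no parsable state field
 * (for example the column-header line).
 */
int tcp_state_from_line(const char *line)
{
    unsigned int state;

    /* Assumed layout: "sl: local_address rem_address st ..." (st is hex). */
    if (sscanf(line, " %*d: %*[0-9A-Fa-f:] %*[0-9A-Fa-f:] %x", &state) != 1)
        return -1;
    return (int)state;
}

/*
 * Hypothetical helper: read the table at `path` and count connections
 * per state. `counts` must have TCP_STATE_MAX + 1 slots. Returns 0 on
 * success, -1 if the file cannot be opened.
 */
int tally_tcp_states(const char *path, unsigned int counts[TCP_STATE_MAX + 1])
{
    char line[512];
    FILE *fp;
    int state;

    assert(counts != NULL);
    memset(counts, 0, (TCP_STATE_MAX + 1) * sizeof(counts[0]));

    if ((fp = fopen(path, "r")) == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        state = tcp_state_from_line(line);
        if (state >= 1 && state <= TCP_STATE_MAX)
            counts[state]++;
    }
    fclose(fp);
    return 0;
}
```

Calling tally_tcp_states("/proc/net/tcp", counts) would leave counts[1] holding the number of established connections, counts[10] the number of listening sockets, and so on, matching the item numbers in the list above.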
Michal Kara --AhhlLboLdkugWU4S Content-Type: application/x-gunzip Content-Disposition: attachment; filename="linux.diff.gz" Content-Transfer-Encoding: base64 H4sICI+05jkAA2xpbnV4LmRpZmYAxVhbb9s2FH6WfwWRAYUTy44lx7LjYEAyR12N2nJmOxuK rhBUmY6ESJSmS9Kg6H/fIam71EZ9mGcHMXluOt9Hijzk3j4cUD8eI8cm8ZeBF9gP5yvjER9s B3NZ1u30+/0mM2FnxWhtRkiSkTSeSdPZaIrk4XDY6fV6lRgV29HlTB5x2+tr1B+NRGmCevzn +rqDOmj+drFUt4LwK/LdvTEwkR94ph5GRpS2Xeza5OClXccz9sbTA3T/7iBBQFxKcKTv8VNq ZJMIB0HsRyFIaF7hS5h7hM+GXzCmvoFvpl3f3mdNI4jsyPZIwTmKXojh4qJz6JmPxYyDmPzD HPql/ELi+hlAM7QHZqf3ikGui0xQUcLeZYTtPdewycAqUGaVKbPKlFmNlFk1yqyMMivjqu5L KbNyyqwaZXWXjCirQJRVIcAqEcd4sApEcUGZGIsSs9xu+GRCwI2FHR8FnhfpbH7yph3YX5jl 8na5+G0LZn3HN32dzjzeBO2+4YVh0fhEp83qi0JlbV6Smp0yk6SZrOQvyMX0UhyjHv2RpuwN uUYA8tkLHgcUqBdHQRiFyPRiEiHvgECIEmHRNt77A5vsjch4CAy3YA8KVFBUfYjne0FUsU+E 9fgwX7ygFpxLq9aQ5HfSKWo6vU6vhNj0CBlgmDOfHTu08B5psfsZB9S3KKVm2GTTrilA+EL0 EMMjc+/tB03fqtqujWuAzaeK60ad//ma68Em+rNhR1LB9+1C0/+6Weykts5yg7P8mnNku5h5 F5x3i5XKvF9zNh0vxAXH+XK9VVs5VR/JPFs90zHCSDfMx4Lz8ma702/m7191tcMIk6LjYrtT tTYJ2+Shku1C+73s2LgcJDsVf6l5p7okcGmbRaHBEpaFy9mwuCxMRGkIywL8SGxV+MUmphPv MTqprd0nRW22jpek2Zp+UouULezNqnwBP+n0GvRsMa67sv2hQVzYK+pavjl8Jw+2UdR1fHM4 YaQpEiNNkVPSaOK2Wdr89EgQiv2rslXOKxhWJLlpQjKNlTRzXUI11SXNq3oiQHgpEehfdXpV K+C2ZAX9hljpCKUJF2UVcxiVLCC0q9pscNJQuaSALwpimLVxFNKiSBAecUCwo8e0VwlIxzN7 Hu1csWGSJmMRJn1PmsDsHvHZvYdakmB4I+/hVd7omrrTt+v5++3uZifAYJ6foXMa5RzAnafg 0Nl53fO9utHUpX6v3axUAZ4Cniy17imCYYqwi0zDcRpd7zbrub651/4QICvwI9lCQR+NwxCH sNmhJyOwvThkSEHSFImlr63uBOmCBsozh0nc6LCdbxeCII0zazqt2T9q3muKv5tDeKXMDK0N aPhOP/XQ7ld64gUVJLf3QO56AUaRZRBk2Q8WbKwJWrqzOjHwFEAglD+5Emfyk3Hge37WQegM UfJdHAW2CQzGPq0zwBZojSw7RHer2xvUZ5HpVo8RbOHBCzp4AcKGadEIFB6bReOxzF53aIxF RUlOF5CXZZOngWv4g5Q/RD9fAcNyKSLeo336MP1utbjtFodBHJ6KoNJ3H+5UfbvbwAYhIqrQ F9rtesV0W3UFPZic2i4PmIS71xa7bXcoZt9T9E2EP1rQCp1eMsCstivWM3SYaVJvSisrnWMf dZrN/E5X4YG/wV73Tr39JFLrZhDJ5BCRVAByP5JZh2HQORMVIDxiDYXEUIDtfH2v7fS1pgIg hqcMJqu4WiBJS7FWMOSjw2DVX0sYtCxsBWN0VBh5JdoCR1aitgJy8b8AkX8GiNwKyPioQPIK 
vQWQrHRvBUQ5KhB+WmgBgh0GWgGYHB9A66HIjzStoEyPCiU7RbUAkh6vWsG4PC4MfqJrA4Id 9dptf8Ojzyp6vGw5paCsaIfiP93FAQT6dkXrpqRuh9NOhx9TYRc8BDi0ujbd0wnG+1Ry2kFf eQmmDKe8BFMUKOUvWAlGP/YBdYseH6uF8SeIISRKvXTG674pdU9Zcj+MSWuzejxa+yWxaBPi 9Dg1P8wNKIdQYFfLDQaxWxpSFhEYbOTOJiAjJu767oLcei69gfNckZogqhORaRlQGtOTiYh0 nZptcBg7QPVZwBoiux5Xv4CENhjpa+K80FqZPWRWvSr+kdZqvNYoXJfyq4lcUL3eyDVtrji+ Y63M5OFMlvJrDmksJTW8lJ7Z6Sc94SUj7ye/ppu0IjNpxHuf1tXJFU8yHOnFDzu5HwwTA3PM 2vQcWGnocVYQlOFsNJNGSWTCL13kCbuLlaeiPMyTgVO8j/fMRRrNLrgUzkUxMaNUOk7yIY/E eyapVOFSfkROhZMOe+96lfspSJLNz+KhgDooM4krsgKbS+VcyupVLh1xaV7+cfFFWSwn4jEX 56UJFytczDd6LpoUREXTKZdn+xCXXiZSvqwnKIZ5CLpOJlIAR9n4FzB3UVG1GgAA --AhhlLboLdkugWU4S-- From owner-pcp@oss.sgi.com Fri Oct 13 00:17:20 2000 Received: by oss.sgi.com id ; Fri, 13 Oct 2000 00:17:10 -0700 Received: from tah14.ctt.cz ([194.108.115.182]:41735 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Fri, 13 Oct 2000 00:17:06 -0700 Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id JAA00620 for pcp@oss.sgi.com; Fri, 13 Oct 2000 09:17:30 +0200 Date: Fri, 13 Oct 2000 09:17:30 +0200 From: lemming@arthur.plbohnice.cz To: pcp@oss.sgi.com Subject: Linux PMDA - missing files Message-ID: <20001013091730.A601@arthur.plbohnice.cz> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="OXfL5xGRrasGEqWY" X-Mailer: Mutt 1.0.1i Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing --OXfL5xGRrasGEqWY Content-Type: text/plain; charset=us-ascii Sorry, the patch didn't create new files I added :| I am sending them with this letter. They should be placed into src/pmdas/linux. Sorry. Michal Kara --OXfL5xGRrasGEqWY Content-Type: text/plain Content-Disposition: attachment; filename="proc_net_tcp.c" /* * Copyright (C) 1999 Silicon Graphics, Inc. All Rights Reserved. 
* * This program is free software; you can redistribute it and/or modify it * under the terms of version 2 of the GNU General Public License as published * by the Free Software Foundation. * * This program is distributed in the hope that it would be useful, but WITHOUT * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or * FITNESS FOR A PARTICULAR PURPOSE. Further, any license provided herein, * whether implied or otherwise, is limited to this program in accordance with * the express provisions of the GNU General Public License. Patent licenses, * if any, provided herein do not apply to combinations of this program with * other product or programs, or any other product whatsoever. This program is * distributed without any warranty that the program is delivered free of the * rightful claim of any third person by way of infringement or the like. See * the GNU General Public License for more details. * * You should have received a copy of the GNU General Public License along with * this program; if not, write the Free Software Foundation, Inc., 59 Temple * Place - Suite 330, Boston MA 02111-1307, USA. 
 */

#ident "$Id: proc_net_sockstat.c,v 1.0"

/* System headers for the calls used below; the archived copy lost the original header names. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#include "proc_net_tcp.h"

static int started = 0;

/*
 * Re-read /proc/net/tcp and count the connections in each TCP state.
 * Returns 0 on success or -errno if the file cannot be opened.
 */
int
refresh_proc_net_tcp(proc_net_tcp_t *proc_net_tcp)
{
    char buf[1024];
    char fmt[64];
    char *s;
    FILE *fp;
    int n;

    memset(proc_net_tcp, 0, sizeof(*proc_net_tcp));

    if ((fp = fopen("/proc/net/tcp", "r")) == NULL) {
        return -errno;
    }

    while (fgets(buf, sizeof(buf)-1, fp) != NULL) {
        fprintf(stderr,"Buf='%s'\n",buf);
        if (!buf[0]) break;
        buf[sizeof(buf)-1] = 0;

        // Find colon
        s = buf;
        while(*s && (*s != ':')) s++;
        if (*s) {
            // Skip three spaces
            n = 3;
            while(*s && n) {
                if (*s == ' ') n--;
                s++;
            }
            if (*s) {
                // Get state (a hexadecimal number)
                n = 0;
                for(;;) {
                    if (isalpha(*s)) n = (n<<4) + (toupper(*s)-'A'+10);
                    else if (isdigit(*s)) n = (n<<4) + (*s-'0');
                    else break;
                    s++;
                }
                if (n < _PM_TCP_LAST) proc_net_tcp->stat[n]++;
            }
        }
    }

    fclose(fp);

    /* success */
    return 0;
}

--OXfL5xGRrasGEqWY Content-Type: text/plain Content-Disposition: attachment; filename="proc_net_tcp.h"

/* * * Copyright (C) 1999 Silicon Graphics, Inc. All Rights Reserved. * * This program is free software; you can redistribute it and/or modify it * under the terms of version 2 of the GNU General Public License as published * by the Free Software Foundation. * * This program is distributed in the hope that it would be useful, but WITHOUT * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or * FITNESS FOR A PARTICULAR PURPOSE. Further, any license provided herein, * whether implied or otherwise, is limited to this program in accordance with * the express provisions of the GNU General Public License. Patent licenses, * if any, provided herein do not apply to combinations of this program with * other product or programs, or any other product whatsoever. This program is * distributed without any warranty that the program is delivered free of the * rightful claim of any third person by way of infringement or the like. See * the GNU General Public License for more details. 
* * You should have received a copy of the GNU General Public License along with * this program; if not, write the Free Software Foundation, Inc., 59 Temple * Place - Suite 330, Boston MA 02111-1307, USA. */ #ident "$Id: proc_net_tcp.h,v 1.0$" enum { _PM_TCP_ESTABLISHED = 1, _PM_TCP_SYN_SENT, _PM_TCP_SYN_RECV, _PM_TCP_FIN_WAIT1, _PM_TCP_FIN_WAIT2, _PM_TCP_TIME_WAIT, _PM_TCP_CLOSE, _PM_TCP_CLOSE_WAIT, _PM_TCP_LAST_ACK, _PM_TCP_LISTEN, _PM_TCP_CLOSING, _PM_TCP_LAST }; typedef struct { int stat[_PM_TCP_LAST]; } proc_net_tcp_t; extern int refresh_proc_net_tcp(proc_net_tcp_t *); --OXfL5xGRrasGEqWY-- From owner-pcp@oss.sgi.com Sun Oct 15 18:32:33 2000 Received: by oss.sgi.com id ; Sun, 15 Oct 2000 18:32:24 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:7284 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 15 Oct 2000 18:32:04 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id SAA18274 for ; Sun, 15 Oct 2000 18:24:17 -0700 (PDT) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA15842; Mon, 16 Oct 2000 12:30:38 +1100 Date: Mon, 16 Oct 2000 12:28:47 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com To: lemming@arthur.plbohnice.cz cc: pcp@oss.sgi.com Subject: Re: Patch to Linux PMDA In-Reply-To: <20001013091409.A523@arthur.plbohnice.cz> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing On Fri, 13 Oct 2000 lemming@arthur.plbohnice.cz wrote: > Hello! > > I have made patch to Linux PMDA. It takes data from /proc/tcp and creates > network.tcpconn subtree containing values: > ... > which hold number of connections in each state. 
I did this because we here
> need to archive data on how many connections are there to/from our server and in
> which state they are. If you consider it worthy, please apply it to standard
> distribution.
>
> Michal Kara

Michal, thanks for your contribution! It's certainly "worth it", especially for monitoring servers with a high tcp connection rate (e.g. http servers). A couple of points:

1. I changed your code slightly; removed a debug printf that was filling up pmcd.log, fixed some indentation to match our sw=4 ts=8 style and changed your choice of cluster number to 19 (clusters 16, 17 and 18 are already in use by the new xfs, pagebuf and hinv.cpuinfo metric clusters).

2. the code you sent me had "Copyright SGI" in the header comments. Since it was obvious you based your code on proc_net_sockstat.[ch], I assume it's OK with you to leave those copyrights intact. I did however add a comment in proc_net_tcpconn.[ch] that the code was contributed by you. Is that OK?

3. here at SGI we should endeavor to sync up on oss.sgi.com more often (there is potential for duplicated effort that we obviously want to avoid)

As soon as I get an answer to (2) above, I'll upload to oss.sgi.com. The new revision will be pcp-2.1.10. Below is the changelog relative to the current version on oss.sgi.com (2.1.7-2).

thanks!
-- Mark Goodwin

--- CHANGELOG since pcp-2.1.7-2 ---

pcp-2.1.8-2 (released 30 June 2000)
- fix for bug #793871 pmlogger_check fails after redhat upgrade (because PCP entries in /usr/share/magic were clobbered)
- also install /var/pcp/pmdas/linux/pmdalinux (as a non-DSO agent for debugging and profiling purposes).
- added pmda.uname (uname -a) and pmda.version (linux pmda version) metrics. The pmda.uname metric is needed by the "pcp" command.
- fix for #789025 fix to ensure rpm --verify succeeds immediately after an install, and other errors in pmlogger_check
- released with ACE 1.3 (MR 19 Jul 2000)

pcp-2.1.9-6 (released 2 Aug 2000 for propack1.4 - alpha, not final)
- install /usr/share/pcp/lib/rc-proc.sh containing common shell functions for use by rc scripts - these functions are tolerant of the chkconfig command missing (as in SUSE).
- update all rc scripts and {pmlogger,pmie}_{check,daily} scripts to use the new rc-proc.sh functions. Remove the /etc/sysconfig stuff entirely (it was not being used anyway).
- fix for #795934 : after rpm -U, pcp is chkconfig off. It turned out that an upgrade executes the %post _and_ the %preun scripts, which resulted in pcp being chkconfig'd on then off again.
- fix pmie rc scripts so they work, are chkconfig friendly, and cope with _and_ without pmieconf (which is in pcp-pro). Also install /var/pcp/config/pmie/config.default as a simple example to monitor the load average and report to syslog. The pmie daemon is chkconfig off by default.
- default run levels for pmcd and pmie (daemon) are now 2345, for SUSE
- reconcile troff and groff differences in man page sources
- fix for bug #797049 use strftime(%z) to determine timezone offsets w.r.t. daylight savings
- portability surgery on src/libpcp_trace, and add new pmtracecounter() function, see pmtracebegin(3) for details.
- reconcile pcp.env and pmcd.options from IRIX
- fix for bug #797048 update-magic does not fully remove old entries before adding new, hence the magic file would grow after each upgrade
- other minor reconciliation work with IRIX
- fix build environment to allow proper handling of compressed man pages
- add support for RPM version 4.
- add support for kernel.{all,percpu}.syscall metrics (requires kernel patch)
- fix for bug #797164: potential SEGV due to calling realloc on a misused pointer - src/pmdas/weblog/weblog.c
- use realpath(3) to resolve devices in /proc/mounts for filesys.* metrics

pcp-2.1.9-11 (unreleased)
- add pagebuf metrics (Daniel and Nathan)
- fixes so the build works if pcp is not already installed
- minor security fix to pcp.spec.in (force mode 644 for .NeedRebuild)
- make sure the src RPM builds correctly (LSRCFILE issues from LinuxWorld)
- fix for bug #797756, upgrade from pcp2.1.6 to any newer version leaves pcp chkconfig off and the name space does not get rebuilt.
- extended the weblogs PCP agent so it can report proxy/squid http servers, and added assorted http cache statistics.
- fixed the Cisco router PCP agent (it was broken in pp1.3).
- add support for disk stats in 2.4.x kernels with "disk_io" field in /proc/stat (only used when sard patch is not installed)
- if the pcp-pro package (SGI proprietary) is installed, all libpcp clients on linux are now "authorized" to monitor IRIX systems that do not have a pmcd collector license.

pcp-2.1.9-12 (released circa Sept 13 2000, with SGI Propack1.4)
- for 2.4 without sard, correctly match disk numbers in /proc/stat with major,minor numbers in /proc/partitions.

pcp-2.1.10-5 (new one, not released yet)
- guard against DOS attack by restricting incoming PDU size to 64K.
- add hinv.map.cpu and hinv.cpu metrics exported by /proc/cpuinfo
- fix small error in INSTALL_MAN rule in src/include/builddefs.in
- fix for bug #793427 - correct symlinks for man pages with multiple entries in the .SH NAME section.
- add network.tcpconn metrics to export counts of tcp connections in each state.
Code contributed by Michal Kara (lemming@arthur.plbohnice.cz) From owner-pcp@oss.sgi.com Thu Oct 19 20:08:48 2000 Received: by oss.sgi.com id ; Thu, 19 Oct 2000 20:08:38 -0700 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:49767 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Thu, 19 Oct 2000 20:08:24 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via SMTP id UAA05783 for ; Thu, 19 Oct 2000 20:15:42 -0700 (PDT) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA22707; Fri, 20 Oct 2000 14:07:03 +1100 Date: Fri, 20 Oct 2000 14:04:27 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com To: pcp@oss.sgi.com, linux-perf@vger.kernel.org, beowulf@webserv.gsfc.nasa.gov cc: sgi.engr.pcp@cthulhu.sgi.com Subject: [ANNOUNCE] SGI Performance Co-Pilot 2.1.10 now available Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 2678 Lines: 59 SGI is pleased to announce that a new version of Performance Co-Pilot (PCP) open source (version 2.1.10-8) is now available for download from http://oss.sgi.com/projects/pcp/download PCP is an extensible system monitoring package with a client/server architecture. It provides a distributed unifying abstraction for all interesting performance statistics in /proc and assorted applications (e.g. Apache). The PCP library APIs are robust and well documented, supporting rapid deployment of new and diverse sources of performance data and the development of sophisticated performance monitoring tools. There are binary RPMS for ia32 and ia64, the source RPM and tar.gz files. The source is also known to build and work for Linux-ppc and Linux-alpha. 
The PCP homepage is at http://oss.sgi.com/projects/pcp and you can join the PCP mailing list via http://oss.sgi.com/projects/pcp/mail.html

Changes since the last public release (2.1.7) include :-

    new metrics: hinv.cpu.* exporting /proc/cpuinfo
    new metrics: network.tcpconn.* exporting /proc/net/tcp (thanks Michal Kara)
    new metrics: web log squid and proxy stats
    new metrics: kernel.{all,percpu}.syscall (requires kernel patch available on request)
    new metrics: pagebuf and xfs (both require 2.4 kernel and XFS)
    minor fixes for ia64 build
    fixes for RPM upgrade problems
    reworked rc scripts for better SUSE support
    fixes for pmie scripts ("Inference Engine") in daemon mode
    new pmdatrace(3) functions
    fix for magic file handling
    proper handling of compressed man pages
    fix build environment to support RPM v4
    fixes so the build works if pcp is not already installed
    fixed the Cisco router PCP agent
    fix disk stats in 2.4.x kernels with "disk_io" field in /proc/stat
    authorization to monitor unlicensed IRIX systems if "pcp-pro" is installed
    security: guard against DOS attack by restricting incoming PDU size to 64K
    assorted other minor bug fixes and reconciliation with PCP in IRIX

To use the new XFS and pagebuf metrics, obviously you need a kernel that supports XFS - see http://oss.sgi.com/projects/xfs/cvs_download.html or join the XFS mailing list via http://oss.sgi.com/projects/xfs/mail.html

SGI would like to thank those who have contributed to PCP. In particular, we would like to thank Michal Kara for contributing new metrics to the linux PMDA, new monitoring tools ("PCP_MON") and several new PMDAs. SGI would be delighted to hear from anyone wanting to contribute to the PCP project (especially new monitoring tools), and will provide technical assistance getting your project off the ground.
thanks -- Mark Goodwin SGI Engineering From owner-pcp@oss.sgi.com Fri Oct 20 01:02:09 2000 Received: by oss.sgi.com id ; Fri, 20 Oct 2000 01:01:59 -0700 Received: from tah14.ctt.cz ([194.108.115.182]:9997 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Fri, 20 Oct 2000 01:01:42 -0700 Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id KAA07868 for pcp@oss.sgi.com; Fri, 20 Oct 2000 10:01:19 +0200 Date: Fri, 20 Oct 2000 10:01:19 +0200 From: lemming@arthur.plbohnice.cz To: pcp@oss.sgi.com Subject: Running PCP as non-root Message-ID: <20001020100119.A7835@arthur.plbohnice.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0.1i Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 327 Lines: 9 Hello! I would like to ask why the PCP script checks whether it is run by root. I need to run PMCD as a non-root user (possibly in a chrooted environment) for security reasons. I have commented out this check and PMCD seemed to work OK. So is there any reason why it must be run as root? 
Thanks, Michal Kara From owner-pcp@oss.sgi.com Sat Oct 21 20:33:19 2000 Received: by oss.sgi.com id ; Sat, 21 Oct 2000 20:32:59 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:45060 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Sat, 21 Oct 2000 20:32:43 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id UAA24124 for ; Sat, 21 Oct 2000 20:24:54 -0700 (PDT) mail_from (nathans@wobbly.melbourne.sgi.com) Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA04209; Sun, 22 Oct 2000 14:31:25 +1100 Received: (from nathans@localhost) by wobbly.melbourne.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) id OAA83927; Sun, 22 Oct 2000 14:31:23 +1100 (EDT) From: "Nathan Scott" Message-Id: <10010221431.ZM84156@wobbly.melbourne.sgi.com> Date: Sun, 22 Oct 2000 14:31:21 -0400 In-Reply-To: lemming@arthur.plbohnice.cz "Running PCP as non-root" (Oct 20, 10:01am) References: <20001020100119.A7835@arthur.plbohnice.cz> X-Mailer: Z-Mail (3.2.3 08feb96 MediaMail) To: lemming@arthur.plbohnice.cz, pcp@oss.sgi.com Subject: Re: Running PCP as non-root Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 836 Lines: 27 hi, On Oct 20, 10:01am, lemming@arthur.plbohnice.cz wrote: > Subject: Running PCP as non-root > Hello! > > I would like to ask why the PCP script checks whether it it is ran by root. I > need to run PMCD as non-root user (possibly in chrooted environment) for security > reasons. I have commented out this check and PMCD seemed to work OK. So is there > any reason why it must be ran as root? 
> I think the reasons are mainly historical - on IRIX, the libirixpmda PMDA needs to be root to be able to make some of its system calls (and a couple of /dev/kmem reads), so pmcd needs to be root. pmcd also needs to write stuff into its log file, which is in a directory where only root can write (by default). I _think_ those are the only reasons... thats all I can remember off the top of my head, anyway. cheers. -- Nathan From owner-pcp@oss.sgi.com Sun Oct 22 18:20:52 2000 Received: by oss.sgi.com id ; Sun, 22 Oct 2000 18:20:32 -0700 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:31516 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 22 Oct 2000 18:20:17 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via SMTP id SAA02243 for ; Sun, 22 Oct 2000 18:27:37 -0700 (PDT) mail_from (max@kuku.melbourne.sgi.com) Received: from kuku.melbourne.sgi.com (kuku.melbourne.sgi.com [134.14.55.163]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA08931; Mon, 23 Oct 2000 12:18:55 +1100 Received: (from max@localhost) by kuku.melbourne.sgi.com (SGI-8.9.3/8.9.3) id MAA27674; Mon, 23 Oct 2000 12:18:54 +1100 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <14835.37374.556589.216686@kuku.melbourne.sgi.com> Date: Mon, 23 Oct 2000 12:18:54 +1100 (EST) From: Max Matveev To: lemming@arthur.plbohnice.cz, pcp@oss.sgi.com Subject: Re: Running PCP as non-root In-Reply-To: <10010221431.ZM84156@wobbly.melbourne.sgi.com> References: <20001020100119.A7835@arthur.plbohnice.cz> <10010221431.ZM84156@wobbly.melbourne.sgi.com> X-Mailer: VM 6.72 under 21.1 (patch 10) "Capitol Reef" XEmacs Lucid Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 1200 Lines: 26 >>>>> "NS" == Nathan Scott writes: NS> I think the reasons are mainly 
historical - on IRIX, the NS> libirixpmda PMDA needs to be root to be able to make some NS> of its system calls (and a couple of /dev/kmem reads), so NS> pmcd needs to be root. NS> pmcd also needs to write stuff into its log file, which is NS> in a directory where only root can write (by default). NS> I _think_ those are the only reasons... thats all I can NS> remember off the top of my head, anyway. Nathan is right about being root on IRIX. On Linux I recently tried running pcp as a non-privileged user, and the only problem I found was the permissions on the log file; the rest just worked. As far as changing the model goes, I don't see a reason (other than paranoia) to be non-privileged, because it would mean introducing the concept of a "pcp" user (remember, init scripts are all started by root, and unless we specifically change uid we're not going to get any advantage here). It would also mean that if in the future we had to do some kind of fancy ioctl-ing, it may not work from the non-privileged user, and Linux doesn't have capabilities yet. Or does it? 
max From owner-pcp@oss.sgi.com Sun Oct 22 18:33:02 2000 Received: by oss.sgi.com id ; Sun, 22 Oct 2000 18:32:52 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:33303 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 22 Oct 2000 18:32:42 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id SAA15491 for ; Sun, 22 Oct 2000 18:24:53 -0700 (PDT) mail_from (nathans@wobbly.melbourne.sgi.com) Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA09015 for <@larry.melbourne.sgi.com:pcp@oss.sgi.com>; Mon, 23 Oct 2000 12:31:11 +1100 Received: (from nathans@localhost) by wobbly.melbourne.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) id MAA84448 for pcp@oss.sgi.com; Mon, 23 Oct 2000 12:31:10 +1100 (EDT) From: "Nathan Scott" Message-Id: <10010231231.ZM83885@wobbly.melbourne.sgi.com> Date: Mon, 23 Oct 2000 12:31:09 -0400 In-Reply-To: Max Matveev "Re: Running PCP as non-root" (Oct 23, 12:18pm) References: <20001020100119.A7835@arthur.plbohnice.cz> <10010221431.ZM84156@wobbly.melbourne.sgi.com> <14835.37374.556589.216686@kuku.melbourne.sgi.com> X-Mailer: Z-Mail (3.2.3 08feb96 MediaMail) To: pcp@oss.sgi.com Subject: Re: Running PCP as non-root Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 937 Lines: 28 hi, On Oct 23, 12:18pm, Max Matveev wrote: > Subject: Re: Running PCP as non-root > ... > As far as changing the model, I don't see the reason (other then > paranoia) to be non-priveleged because it will mean we would have to > introduce a concept of "pcp" user (remeber, init scripts are all > started by root and unless we specifically change uid, we're not going > to get any advantage here). 
could we just become user "nobody" rather than creating a new "pcp" user? (and if that failed fall back to root?) > ... It will also mean that should in the > future we'd have to make some kind of fancy ioctl-ing, it may not work > from the non-priveleged user and Linux doesn't have capabilities > yet. Or does it? > that could be done as a separate (setuid) pmda if the need arose? - hopefully it won't - and ioctl doesn't always require root... (just need to be able to open the file passed in thru ioctl arg1). cheers. -- Nathan From owner-pcp@oss.sgi.com Sun Oct 22 21:02:04 2000 Received: by oss.sgi.com id ; Sun, 22 Oct 2000 21:01:55 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:4650 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 22 Oct 2000 21:01:28 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id UAA25900 for ; Sun, 22 Oct 2000 20:53:39 -0700 (PDT) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA09712; Mon, 23 Oct 2000 14:58:54 +1100 Date: Mon, 23 Oct 2000 14:55:43 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com To: Nathan Scott cc: pcp@oss.sgi.com Subject: Re: Running PCP as non-root In-Reply-To: <10010231231.ZM83885@wobbly.melbourne.sgi.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 1165 Lines: 26

You can run PCP as non-root, as Michal suggested, e.g.

    # mkdir ~/log
    # edit /etc/pcp.conf and change the PCP_LOG_DIR variable to the new dir
    # edit /etc/rc.d/init.d/pcp and comment out the root uid check.
    # then run "/etc/rc.d/init.d/pcp start" as yourself or some other user.

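Before starting pmcd as another user with the steps above, it can help to confirm that the chosen user is actually able to write to the new log directory. The following is a minimal C sketch using the POSIX access(2) call; the helper name log_dir_usable is hypothetical and is not part of PCP, and an rc script would more naturally do the same test in shell.

```c
#include <unistd.h>

/*
 * Hypothetical helper (not part of PCP): return 1 if the calling user
 * can create files in `dir`, 0 otherwise (including when `dir` does
 * not exist).
 */
int log_dir_usable(const char *dir)
{
    /* W_OK: may create entries; X_OK: may traverse the directory. */
    return access(dir, W_OK | X_OK) == 0;
}
```

It would be called with the directory assigned to PCP_LOG_DIR in /etc/pcp.conf. Note that access(2) checks against the real uid, which is the relevant one when the daemon is started directly by the non-root user.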
In the next PCP release, I'll remove the root uid check in the rc script and replace it with a check that $PCP_LOG_DIR is writable by the current user. If you're using the cron scripts for pmlogger and/or pmie (see the man pages for pmlogger_check(1) and pmie_check(1)), you'll need to move these from root's crontab to that of your chosen non-root user. The only other problem I can think of is that the optional PMDA Install scripts (in /var/pcp/pmdas) will want to manipulate files in directories owned by root. These Install scripts will also want to restart or hup pmcd. But once you have the optional PMDAs configured, you can restart as non-root. In theory you should be able to relocate all of PCP's var hierarchy by changing PCP_VAR_DIR, PCP_PMCDCONF_PATH, PCP_PMDAS_DIR, PCP_LOG_DIR and (perhaps) PCP_TMP_DIR to some place else [I haven't tried this]. -- Mark From owner-pcp@oss.sgi.com Sun Oct 22 22:25:05 2000 Received: by oss.sgi.com id ; Sun, 22 Oct 2000 22:24:55 -0700 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:3620 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 22 Oct 2000 22:24:30 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via SMTP id WAA09806 for ; Sun, 22 Oct 2000 22:31:50 -0700 (PDT) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA10276; Mon, 23 Oct 2000 16:23:08 +1100 Date: Mon, 23 Oct 2000 16:19:56 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com To: olemd@sgi.com cc: pcp@oss.sgi.com Subject: Re: PCP on sparc-linux (fwd) In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 1545 Lines: 43 > Date: Thu, 10 Aug 2000 
13:35:55 +0200 > From: Ole-Morten Duesund > Reply-To: olemd@acm.org > To: owner-pcp@oss.sgi.com > Subject: PCP on sparc-linux > > Hi, I'v ported pcp to work on sparc-linux too... Actually ported is a > big word, just a tiny patch really but it took some time to figure out > (doesn't it always?) > > Anyway, I've attached the patch against 2.1.7, I know it doesn't break > anything on x86 at least - and I can't imagine it breaking anything > anywhere else either. And most(?) importantly, it actually does work on > sparc-linux. Hi Ole-Morten, sorry this took so long to respond (somehow I missed your original mail). I'm just looking at your "pcp for sparc-linux" patch, and have a few questions; 1. for sparc linux, do you explicitly need -fPIC for shared libraries that will be loaded with dlopen at run time (such as pmdalinux.so)? Or do you need it for everything? If it's just for loadable shared libraries, then it should probably become a configure thing that is conditionally added to LCFLAGS in Makefiles for libraries that actually need it. 2. your other change was to proc_interrupts.h with some __sparc__ conditional header includes. Were these extra headers only needed for the #include immediately below? It turns out that kernel_stat.h is not actually needed - this is left over code from an earlier implementation ... so I'll just delete the unneeded include rather than add the conditional sparc code. 
thanks
-- Mark

From owner-pcp@oss.sgi.com Mon Oct 23 04:56:07 2000
Received: by oss.sgi.com id ; Mon, 23 Oct 2000 04:55:57 -0700
Received: from tah14.ctt.cz ([194.108.115.182]:8968 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Mon, 23 Oct 2000 04:55:41 -0700
Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id NAA04851 for pcp@oss.sgi.com; Mon, 23 Oct 2000 13:55:21 +0200
Date: Mon, 23 Oct 2000 13:55:21 +0200
From: lemming@arthur.plbohnice.cz
To: pcp@oss.sgi.com
Subject: Re: Running PCP as non-root
Message-ID: <20001023135521.A4810@arthur.plbohnice.cz>
References: <20001020100119.A7835@arthur.plbohnice.cz> <10010221431.ZM84156@wobbly.melbourne.sgi.com> <14835.37374.556589.216686@kuku.melbourne.sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <14835.37374.556589.216686@kuku.melbourne.sgi.com>; from makc@sgi.com on Mon, Oct 23, 2000 at 12:18:54PM +1100
Sender: owner-pcp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;pcp-outgoing
Content-Length: 1064
Lines: 21

> As far as changing the model, I don't see the reason (other than
> paranoia) to be non-privileged because it will mean we would have to
> introduce a concept of "pcp" user (remember, init scripts are all
> started by root and unless we specifically change uid, we're not going
> to get any advantage here). It will also mean that should in the
> future we'd have to make some kind of fancy ioctl-ing, it may not work
> from the non-privileged user and Linux doesn't have capabilities
> yet. Or does it?

What I was talking about was running PCP as a non-root user, not installing it. It is our policy not to run network services as root unless it is required for the service to work.

I think a good implementation would be to set up a new environment variable naming the user that pmcd should run as. It would be "root" by default so it wouldn't break anything.
Those wanting to run it as a non-privileged user would have to change it. The init.d script might even change ownership of the log directory (?) and then run pmcd via the su command.

Michal

From owner-pcp@oss.sgi.com Mon Oct 23 04:58:07 2000
Received: by oss.sgi.com id ; Mon, 23 Oct 2000 04:57:47 -0700
Received: from tah14.ctt.cz ([194.108.115.182]:30472 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Mon, 23 Oct 2000 04:57:41 -0700
Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id NAA04871 for pcp@oss.sgi.com; Mon, 23 Oct 2000 13:57:25 +0200
Date: Mon, 23 Oct 2000 13:57:25 +0200
From: lemming@arthur.plbohnice.cz
To: pcp@oss.sgi.com
Subject: Logging multiple remote hosts
Message-ID: <20001023135725.B4810@arthur.plbohnice.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
Sender: owner-pcp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;pcp-outgoing
Content-Length: 1258
Lines: 28

Hello!

I am setting up PCP logging for multiple remote hosts.
I have created a control file containing these lines:

# local primary logger
# /sbin/init.d/pcp added this default configuration
# on Sat Oct 21 12:54:36 CEST 2000
LOCALHOSTNAME y n /var/log/pcp/pmlogger/LOCALHOSTNAME -c config.default

# Loggers for Centrum
mail1.centrum.cz n n PCP_LOG_DIR/mail1.centrum.cz -c ./config.apache
hpweb.centrum.cz n n PCP_LOG_DIR/hpweb.centrum.cz -c ./config.apache_mysql
sleuth1.centrum.cz n n PCP_LOG_DIR/sleuth1.centrum.cz -c ./config.apache
sleuth.centrum.cz n n PCP_LOG_DIR/sleuth.centrum.cz -c ./config.apache_mysql
moje.centrum.cz n n PCP_LOG_DIR/moje.centrum.cz -c ./config.apache_mysql
ad.centrum.cz n n PCP_LOG_DIR/ad.centrum.cz -c ./config.apache_mysql
x1.xchat.cz n n PCP_LOG_DIR/x1.xchat.cz -c ./config.apache_mysql
db.centrum.cz n n PCP_LOG_DIR/db.centrum.cz -c ./config.apache_mysql
host1.netcentrum.cz n n PCP_LOG_DIR/host1.netcentrum.cz -c ./config.apache_mysql

After the start there is no error message, but logging starts only for the first host (mail1). The other directories get created, but they are empty.

Do you have any idea what could be wrong?
Michal

From owner-pcp@oss.sgi.com Mon Oct 23 06:12:10 2000
Received: by oss.sgi.com id ; Mon, 23 Oct 2000 06:12:00 -0700
Received: from heffalump.fnal.gov ([131.225.9.20]:25730 "EHLO fnal.gov") by oss.sgi.com with ESMTP id ; Mon, 23 Oct 2000 06:11:40 -0700
Received: from fnal.gov ([131.225.80.75]) by smtp.fnal.gov (PMDF V6.0-24 #44770) with ESMTP id <0G2V00C4HWLWEG@smtp.fnal.gov> for pcp@oss.sgi.com; Mon, 23 Oct 2000 08:10:44 -0500 (CDT)
Date: Mon, 23 Oct 2000 08:10:43 -0500
From: Troy Dawson
Subject: Re: Logging multiple remote hosts
To: lemming@arthur.plbohnice.cz
Cc: pcp@oss.sgi.com
Message-id: <39F438D3.1FCC27FE@fnal.gov>
MIME-version: 1.0
X-Mailer: Mozilla 4.73 [en] (X11; U; Linux 2.2.16-3smp i686)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
X-Accept-Language: en
References: <20001023135725.B4810@arthur.plbohnice.cz>
Sender: owner-pcp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;pcp-outgoing
Content-Length: 1774
Lines: 41

Howdy,
I was having the exact same problem, but I thought they had already incorporated the fix that they made for me into the latest releases. Which version are you running?
Troy

lemming@arthur.plbohnice.cz wrote:
>
> Hello!
>
> I am setting up PCP logging for multiple remote hosts.
I have create control > file containing those lines: > > # local primary logger > # /sbin/init.d/pcp added this default configuration > # on Sat Oct 21 12:54:36 CEST 2000 > LOCALHOSTNAME y n /var/log/pcp/pmlogger/LOCALHOSTNAME -c config.default > > # Loggers for Centrum > mail1.centrum.cz n n PCP_LOG_DIR/mail1.centrum.cz -c ./config.apache > hpweb.centrum.cz n n PCP_LOG_DIR/hpweb.centrum.cz -c ./config.apache_mysql > sleuth1.centrum.cz n n PCP_LOG_DIR/sleuth1.centrum.cz -c ./config.apache > sleuth.centrum.cz n n PCP_LOG_DIR/sleuth.centrum.cz -c ./config.apache_mysql > moje.centrum.cz n n PCP_LOG_DIR/moje.centrum.cz -c ./config.apache_mysql > ad.centrum.cz n n PCP_LOG_DIR/ad.centrum.cz -c ./config.apache_mysql > x1.xchat.cz n n PCP_LOG_DIR/x1.xchat.cz -c ./config.apache_mysql > db.centrum.cz n n PCP_LOG_DIR/db.centrum.cz -c ./config.apache_mysql > host1.netcentrum.cz n n PCP_LOG_DIR/host1.netcentrum.cz -c ./config.apache_mysql > > After the start there is no error message, but the logging starts only for the > first host (mail1). Other directories get created, but they are empty. > > Do you have an idea what could be wrong? 
> > Michal -- __________________________________________________ Troy Dawson dawson@fnal.gov (630)840-6468 Fermilab ComputingDivision/OSS SCS Group __________________________________________________ From owner-pcp@oss.sgi.com Mon Oct 23 17:58:58 2000 Received: by oss.sgi.com id ; Mon, 23 Oct 2000 17:58:48 -0700 Received: from deliverator.sgi.com ([204.94.214.10]:61800 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Mon, 23 Oct 2000 17:58:32 -0700 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id RAA00492 for ; Mon, 23 Oct 2000 17:50:43 -0700 (PDT) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA17133; Tue, 24 Oct 2000 11:57:05 +1100 Date: Tue, 24 Oct 2000 11:53:45 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com To: Troy Dawson cc: lemming@arthur.plbohnice.cz, pcp@oss.sgi.com Subject: Re: Logging multiple remote hosts In-Reply-To: <39F438D3.1FCC27FE@fnal.gov> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 523 Lines: 14 On Mon, 23 Oct 2000, Troy Dawson wrote: > Howdy, > I was having the exact same problem, but I thought they had already > incorperated the fix that they made for me into the latest releases. Which > version are you running? I just tried pcp-2.1.10 with with 7 pmloggers configured and it worked fine. Michal: could this perhaps be related to running as non-root user? And do you have any lingering pmlogger_check and/or pmcd_wait processes (these are the processes that actually launch the pmlogger processes)? 
-- Mark From owner-pcp@oss.sgi.com Mon Oct 23 23:25:00 2000 Received: by oss.sgi.com id ; Mon, 23 Oct 2000 23:24:50 -0700 Received: from tah14.ctt.cz ([194.108.115.182]:11526 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Mon, 23 Oct 2000 23:24:29 -0700 Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id IAA32111 for pcp@oss.sgi.com; Tue, 24 Oct 2000 08:24:01 +0200 Date: Tue, 24 Oct 2000 08:24:01 +0200 From: lemming@arthur.plbohnice.cz To: pcp@oss.sgi.com Subject: Re: Logging multiple remote hosts Message-ID: <20001024082401.B31908@arthur.plbohnice.cz> References: <39F438D3.1FCC27FE@fnal.gov> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0.1i In-Reply-To: ; from markgw@sgi.com on Tue, Oct 24, 2000 at 11:53:45AM +1100 Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 540 Lines: 12 > I just tried pcp-2.1.10 with with 7 pmloggers configured and it worked fine. > Michal: could this perhaps be related to running as non-root user? > And do you have any lingering pmlogger_check and/or pmcd_wait processes > (these are the processes that actually launch the pmlogger processes)? No, it was pcp 2.1.7 - I didn't use 2.1.10 because of the problems with pmns not rebuilding we discussed yesterday. Now as they are solved, I have installed 2.1.10 on the logging machine and it started to work. 
Thank you,
Michal

From owner-pcp@oss.sgi.com Wed Oct 25 22:09:05 2000
Received: by oss.sgi.com id ; Wed, 25 Oct 2000 22:08:55 -0700
Received: from tah14.ctt.cz ([194.108.115.182]:48901 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Wed, 25 Oct 2000 22:08:42 -0700
Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id HAA16370 for pcp@oss.sgi.com; Thu, 26 Oct 2000 07:08:24 +0200
Date: Thu, 26 Oct 2000 07:08:24 +0200
From: lemming@arthur.plbohnice.cz
To: pcp@oss.sgi.com
Subject: Re: Logging multiple remote hosts
Message-ID: <20001026070824.A16340@arthur.plbohnice.cz>
References: <20001023135725.B4810@arthur.plbohnice.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: ; from kenmcd@melbourne.sgi.com on Thu, Oct 26, 2000 at 11:29:57AM +1100
Sender: owner-pcp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;pcp-outgoing
Content-Length: 395
Lines: 10

> Did this get resolved?

Oh yes. (I think I sent a mail about that...) I installed pcp-2.1.10 and rebuilt the pmns. (It was not rebuilt automatically after installing from sources, which is why I had been using 2.1.7. I solved that with Mark by sending him a patch so that the pcp startup script also rebuilds the pmns based on modification date.) With 2.1.10, it started to work fine.
Thanks,
Michal

From owner-pcp@oss.sgi.com Fri Oct 27 02:23:39 2000
Received: by oss.sgi.com id ; Fri, 27 Oct 2000 02:23:20 -0700
Received: from tah14.ctt.cz ([194.108.115.182]:42507 "EHLO arthur.plbohnice.cz") by oss.sgi.com with ESMTP id ; Fri, 27 Oct 2000 02:22:57 -0700
Received: (from lemming@localhost) by arthur.plbohnice.cz (8.9.3/8.10.1) id LAA13362 for pcp@oss.sgi.com; Fri, 27 Oct 2000 11:22:14 +0200
Date: Fri, 27 Oct 2000 11:22:14 +0200
From: lemming@arthur.plbohnice.cz
To: pcp@oss.sgi.com
Subject: Getting differently-sampled values from PCP archive
Message-ID: <20001027112214.A12785@arthur.plbohnice.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
Sender: owner-pcp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;pcp-outgoing
Content-Length: 3899
Lines: 114

Hello, it is me again ;)

While developing PCPMON, I have run into something I believe is not my fault:

I have an archive with the metrics 'apache.total_accesses' (A) and 'apache.total_kbytes' (B) gathered every 5 minutes and 'apache.busy_servers' (C) gathered every 30 seconds. (There are other metrics too, but they don't matter now.)

When I want to create a graph from these metrics, I try to get them all with one pmFetch(). But the fetch returns relevant data only for metric C; for metrics A and B it returns nonsense (see below). However, when I remove C from the graph, the data are gathered OK.

What is interesting - it "doesn't work" only in forward mode; in interpolation mode the values are OK (but the values at the end/beginning of the archive are irrelevant :( )

Debug output with explanation is below. What do you think? Can I do something to make it work? (I have used pcp 2.1.10)

Michal

Here is the debug output I got when gathering all three metrics.
The 'Gathering for' value is an unix timestamp (secs.usecs): Gathering for 972368157.0 Gathering for 972368277.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=64.000000 Gathering for 972368337.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=64.000000 Gathering for 972368397.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=47.000000 Gathering for 972368457.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=38.000000 Gathering for 972368517.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=47.000000 Gathering for 972368577.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=40.000000 Gathering for 972368637.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=60.000000 Gathering for 972368697.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=43.000000 Gathering for 972368757.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=34.000000 Gathering for 972368817.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=37.000000 Gathering for 972368877.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=43.000000 Gathering for 972368937.0 apache.total_accesses=135075376.000000 apache.total_kbytes=135075344.000000 apache.busy_servers=36.000000 And now follows data for only first two metrics - they are OK. 
Gathering for 972368157.0 Gathering for 972368277.0 apache.total_accesses=215047.000000 apache.total_kbytes=290007.000000 Gathering for 972368337.0 apache.total_accesses=215047.000000 apache.total_kbytes=290007.000000 Gathering for 972368397.0 apache.total_accesses=215047.000000 apache.total_kbytes=290007.000000 Gathering for 972368457.0 apache.total_accesses=215047.000000 apache.total_kbytes=290007.000000 Gathering for 972368517.0 apache.total_accesses=215047.000000 apache.total_kbytes=290007.000000 Gathering for 972368577.0 apache.total_accesses=215047.000000 apache.total_kbytes=290007.000000 Gathering for 972368637.0 apache.total_accesses=217342.000000 apache.total_kbytes=293630.000000 Gathering for 972368697.0 apache.total_accesses=217342.000000 apache.total_kbytes=293630.000000 Gathering for 972368757.0 apache.total_accesses=217342.000000 apache.total_kbytes=293630.000000 Gathering for 972368817.0 apache.total_accesses=217342.000000 apache.total_kbytes=293630.000000 Gathering for 972368877.0 apache.total_accesses=217342.000000 apache.total_kbytes=293630.000000 Gathering for 972368937.0 apache.total_accesses=219576.000000 apache.total_kbytes=297111.000000 From owner-pcp@oss.sgi.com Sun Oct 29 18:34:11 2000 Received: by oss.sgi.com id ; Sun, 29 Oct 2000 18:34:01 -0800 Received: from pneumatic-tube.sgi.com ([204.94.214.22]:377 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Sun, 29 Oct 2000 18:33:39 -0800 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via SMTP id SAA00012 for ; Sun, 29 Oct 2000 18:41:01 -0800 (PST) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA00014; Mon, 30 Oct 2000 13:31:42 +1100 Date: Mon, 30 Oct 2000 13:31:41 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com 
To: lemming@arthur.plbohnice.cz
cc: pcp@oss.sgi.com
Subject: Re: Getting differently-sampled values from PCP archive
In-Reply-To: <20001027112214.A12785@arthur.plbohnice.cz>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-pcp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;pcp-outgoing
Content-Length: 4394
Lines: 128

On Fri, 27 Oct 2000 lemming@arthur.plbohnice.cz wrote:
>
> When I want to create graph from these metrics, I am trying to get them all
> with one pmFetch(). But the fetch returns relevant data only for metric C, for
> metrics A and B it returns nonsense (see below). However, when I remove C from
> the graph, data are gathered OK.

You should definitely be able to replay/fetch them all with one pmFetch. But note that for metrics collected using pmlogger at different sampling rates, you will get a "staggered start" of metrics being available at the beginning of the archive. Perhaps try replaying from a time stamp that is advanced forward from the start by (at least) an interval equal to the longest sample time.

>
> What is interesting - it "doesn't work" only in forward mode, in
> interpolation mode the values are OK (but values at end/beginning of the archive
> are irrelevant :( )

Could you send me a copy of your archive for me to test?

>
> Debug outputs with explanation are below. What do you think? Can I do
> something to make it work. (I have used pcp 2.1.10)

I don't really see what's wrong with the debug stuff below. Can you please explain in more detail?

thanks
-- Mark

>
>
> There is debug output I got when gathering all three metrics.
The 'Gathering > for' value is an unix timestamp (secs.usecs): > > Gathering for 972368157.0 > Gathering for 972368277.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=64.000000 > Gathering for 972368337.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=64.000000 > Gathering for 972368397.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=47.000000 > Gathering for 972368457.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=38.000000 > Gathering for 972368517.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=47.000000 > Gathering for 972368577.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=40.000000 > Gathering for 972368637.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=60.000000 > Gathering for 972368697.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=43.000000 > Gathering for 972368757.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=34.000000 > Gathering for 972368817.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=37.000000 > Gathering for 972368877.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=43.000000 > Gathering for 972368937.0 > apache.total_accesses=135075376.000000 > apache.total_kbytes=135075344.000000 > apache.busy_servers=36.000000 > > And now follows data for only first two metrics - they are OK. 
> > Gathering for 972368157.0 > Gathering for 972368277.0 > apache.total_accesses=215047.000000 > apache.total_kbytes=290007.000000 > Gathering for 972368337.0 > apache.total_accesses=215047.000000 > apache.total_kbytes=290007.000000 > Gathering for 972368397.0 > apache.total_accesses=215047.000000 > apache.total_kbytes=290007.000000 > Gathering for 972368457.0 > apache.total_accesses=215047.000000 > apache.total_kbytes=290007.000000 > Gathering for 972368517.0 > apache.total_accesses=215047.000000 > apache.total_kbytes=290007.000000 > Gathering for 972368577.0 > apache.total_accesses=215047.000000 > apache.total_kbytes=290007.000000 > Gathering for 972368637.0 > apache.total_accesses=217342.000000 > apache.total_kbytes=293630.000000 > Gathering for 972368697.0 > apache.total_accesses=217342.000000 > apache.total_kbytes=293630.000000 > Gathering for 972368757.0 > apache.total_accesses=217342.000000 > apache.total_kbytes=293630.000000 > Gathering for 972368817.0 > apache.total_accesses=217342.000000 > apache.total_kbytes=293630.000000 > Gathering for 972368877.0 > apache.total_accesses=217342.000000 > apache.total_kbytes=293630.000000 > Gathering for 972368937.0 > apache.total_accesses=219576.000000 > apache.total_kbytes=297111.000000 > From owner-pcp@oss.sgi.com Sun Oct 29 22:40:11 2000 Received: by oss.sgi.com id ; Sun, 29 Oct 2000 22:40:01 -0800 Received: from ad202.166.69.251.magix.com.sg ([202.166.69.251]:49397 "EHLO singnet.com.sg") by oss.sgi.com with ESMTP id ; Sun, 29 Oct 2000 22:39:47 -0800 Received: from localhost (localhost [127.0.0.1]) (uid 502) by singnet.com.sg with local; Mon, 30 Oct 2000 14:39:46 +0800 Date: Mon, 30 Oct 2000 14:39:46 +0800 From: Chai Mee Joon To: pcp@oss.sgi.com Subject: Re: PCP Build for Alpha Message-ID: <20001030143946.A9673@demo1.cybernix.com.sg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline User-Agent: Mutt/1.2.5i Sender: owner-pcp@oss.sgi.com 
Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 680 Lines: 30 Hi Mark, > I am interested from the portability perspective, but you are the only > PCP on Linux-Alpha user that I know of. I just subscribed to the list & read your post on the archive. Well, besides Eric, I am glad to inform the list that you have another Linux-Alpha user. I just grabbed the pcp 2.1.10 sources and it compiled in on my machine without problems. Now running pmcd---pmdaweblog. If you're interested in the specs of the Alpha I'm using: Alpha 21264 (EV6) 750mhz / 8mb on-chip cache, 512mb ecc ram, running SuSE-Linux AXP 6.4 Cheers, Chai Mee Joon, Lawrence 4 Pandan Valley #14-413 Singapore 597628 Mobile: (+65) 98430098 From owner-pcp@oss.sgi.com Mon Oct 30 20:46:54 2000 Received: by oss.sgi.com id ; Mon, 30 Oct 2000 20:46:44 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:45615 "EHLO convert rfc822-to-8bit deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Mon, 30 Oct 2000 20:46:34 -0800 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via SMTP id UAA07326 for ; Mon, 30 Oct 2000 20:38:43 -0800 (PST) mail_from (markgw@sgi.com) Received: from sandpit.melbourne.sgi.com (sandpit.melbourne.sgi.com [134.14.55.132]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA09280; Tue, 31 Oct 2000 15:43:57 +1100 Date: Tue, 31 Oct 2000 15:43:56 +1100 (EST) From: Mark Goodwin X-Sender: markgw@sandpit.melbourne.sgi.com To: The Lemming cc: pcp@oss.sgi.com Subject: Re: Getting differently-sampled values from PCP archive In-Reply-To: <20001030085950.A28597@arthur.plbohnice.cz> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN Content-Transfer-Encoding: 8BIT Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 3478 Lines: 75 On Mon, 30 Oct 2000, The Lemming wrote: > > Could 
you send me a copy of your archive for me to test?
>
> I am attaching it to this letter.
>
> > > Debug outputs with explanation are below. What do you think? Can I do
> > > something to make it work. (I have used pcp 2.1.10)
> >
> > I don't really see what's wrong with the debug stuff below. Can you please
> > explain in more detail?
>
> When I fetch all three metrics, the values for apache.total_accesses and
> apache.total_kbytes are some high numbers that don't make any sense. When I
> try to fetch just those two (using exactly the same time, as you can see from
> the 'Gathering for' timestamps), I get values that are OK and you can see that
> they have no connection to the values obtained in the previous case.

It looks like apache.busy_servers is reporting bad values too. In the archive you sent me, the highest value for apache.busy_servers was 7, yet you are reporting values as high as 64. Are we looking at the same archive?

In any case, there is a very fundamental difference between PM_MODE_FORW and PM_MODE_INTERP. You need to read the pmSetMode(3) man page very carefully, especially this bit:

# If the mode is PM_MODE_FORW then, in the case of pmFetch(3), the
# collection of recorded metric values will be scanned in a forwards direction in
# time, until values for at least one of the requested metrics is located
# after the time origin, and then all requested metrics stored in the log or
# archive at that time will be returned with the corresponding timestamp. A
# mode of PM_MODE_FORW may only be used with an archive context.

The "hidden meaning" here is that when using PM_MODE_FORW, you only get values back for metrics which are actually available for the time you asked for them. You can see this with pmdumplog, e.g.:

pmdumplog -a 20001024.08.09 apache.total_accesses \
    apache.total_kbytes apache.busy_servers
# ... lots deleted
# 17:19:52.999  68.0.0 (apache.total_accesses): No values returned!
#               68.0.1 (apache.total_kbytes): No values returned!
#               68.0.6 (apache.busy_servers): value 1
#
# [68 bytes]
# 17:19:53.003  68.0.0 (apache.total_accesses): value 791
#               68.0.1 (apache.total_kbytes): value 1040
#               68.0.6 (apache.busy_servers): No values returned!
#
# [56 bytes]
# 17:20:53.001  68.0.0 (apache.total_accesses): No values returned!
#               68.0.1 (apache.total_kbytes): No values returned!
#               68.0.6 (apache.busy_servers): value 1
# ... lots more deleted

See how the availability of values is interleaved? I can't be sure, but does your code assume a value is returned for every metric in every fetch? When using PM_MODE_FORW, that assumption is not true for the case where different metrics are logged with different sampling intervals. However, when using PM_MODE_INTERP it is true.

Hence you should always use PM_MODE_INTERP in your code. This will also allow you to replay archives at any update/sampling interval, which is not possible with PM_MODE_FORW.

In addition, if you have PM_SEM_INSTANT metrics that should not be interpolated during archive replay, (e.g. perhaps apache.busy_servers because it does not make sense to have 0.6 servers) then you should change their semantics to PM_SEM_DISCRETE instead. This tells pmFetch not to interpolate values between samples. See the man page for pmLookupDesc(3) for details.

I hope all this is not too confusing!!
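
To make the interleaving concrete, here is a toy Python model of a forward scan over the three records shown in the pmdumplog output. It is only a sketch and not the libpcp implementation: fetch_forward and the list-of-records layout are invented for illustration, the timestamps are reduced to seconds past 17:19:00, and "after the time origin" is assumed to mean strictly after. In real C code the matching check would be on numval in each pmValueSet of the pmResult that pmFetch hands back.

```python
# Toy model of PM_MODE_FORW: each archive record holds only the metrics
# that were logged at that instant (hypothetical data layout, not libpcp).
RECORDS = [
    (52.999, {"apache.busy_servers": 1}),
    (53.003, {"apache.total_accesses": 791, "apache.total_kbytes": 1040}),
    (113.001, {"apache.busy_servers": 1}),
]

WANTED = ["apache.total_accesses", "apache.total_kbytes", "apache.busy_servers"]


def fetch_forward(records, origin, wanted):
    """Return (timestamp, values) for the first record after `origin`
    that holds at least one wanted metric; metrics missing from that
    record map to None ("No values returned!"). None at end of archive."""
    for ts, values in records:
        if ts > origin and any(name in values for name in wanted):
            return ts, {name: values.get(name) for name in wanted}
    return None


def replay(records, wanted):
    """Walk the whole archive forwards, collecting every fetch result."""
    results = []
    origin = float("-inf")
    while True:
        hit = fetch_forward(records, origin, wanted)
        if hit is None:
            return results
        results.append(hit)
        origin = hit[0]  # next fetch continues from this record


if __name__ == "__main__":
    for ts, values in replay(RECORDS, WANTED):
        for name, value in values.items():
            # A caller must test for "no value" before using the number.
            shown = "No values returned!" if value is None else value
            print(ts, name, shown)
```

A caller that skips the None test and reads a number anyway would be using whatever happens to be in its own buffers, which is one possible (unconfirmed) explanation for the large constant values in the debug output.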
-- Mark From owner-pcp@oss.sgi.com Tue Oct 31 10:39:11 2000 Received: by oss.sgi.com id ; Tue, 31 Oct 2000 10:38:51 -0800 Received: from deliverator.sgi.com ([204.94.214.10]:14341 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 31 Oct 2000 10:38:35 -0800 Received: from rattle.melbourne.sgi.com (rattle.melbourne.sgi.com [134.14.55.145]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id KAA26622 for ; Tue, 31 Oct 2000 10:30:45 -0800 (PST) mail_from (kenmcd@melbourne.sgi.com) Received: from localhost (kenmcd@localhost) by rattle.melbourne.sgi.com (SGI-8.9.3/8.9.3) with ESMTP id FAA62970; Wed, 1 Nov 2000 05:35:46 +1100 (AEDT) X-Authentication-Warning: rattle.melbourne.sgi.com: kenmcd owned process doing -bs Date: Wed, 1 Nov 2000 05:35:46 +1100 From: Ken McDonell Reply-To: kenmcd@sgi.com To: Mark Goodwin cc: The Lemming , pcp@oss.sgi.com Subject: Re: Getting differently-sampled values from PCP archive In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 1962 Lines: 52 On Tue, 31 Oct 2000, Mark Goodwin wrote: > > [helpful stuff explaining Michal's "problem" deleted] > > In addition, if you have PM_SEM_INSTANT metrics that should not be > interpolated during archive replay, (e.g. perhaps apache.busy_servers because > it does not make sense to have 0.6 servers) then you should change their > semantics to PM_SEM_DISCRETE instead. This tells pmFetch not to interpolate > values between samples. See the man page for pmLookupDesc(3) for details. This part is not quite correct. For archives, pmSetMode is used to establish a desired time (the when argument) within the span of the archive and the mode and delta arguments define what happens on subsequent pmFetches according to the following semantics: PM_MODE_FORW or PM_MODE_BACK Ignore delta. 
        Go to the next record in the archive that contains at least one of the requested metrics (as described by Mark).

    PM_MODE_INTERP
        March the time by delta at each pmFetch (negative delta means scan the archive backwards). When fetching use sensible heuristics to compute a reasonable value for all requested metrics at the desired time.

The semantics of each requested metric influences the "sensible heuristics" as follows:

    PM_SEM_COUNTER
        Linear interpolation between the two observations that most closely bound the desired time.

    PM_SEM_DISCRETE
        Use the closest earlier value.

    PM_SEM_INSTANT
        Same as PM_SEM_DISCRETE, except there must be two observations that bound the desired time.

In practise, PM_SEM_DISCRETE is used for metrics that are most unlikely to change during the archive, e.g. hinv.ncpu, while PM_SEM_INSTANT is used for metrics where the value is a snapshot of some underlying and changing variable, e.g. filesys.free.

So apache.busy_servers should be PM_SEM_INSTANT and the value when using PM_MODE_INTERP will be integral and akin to the "most recent observed value" relative to the desired pmFetch time.
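
As a rough illustration of the three heuristics above, here is a small Python model. It is not PCP code and does not call libpcp: the function name value_at, the (time, value) list representation and the sample numbers are all made up for the sketch, and the edge cases (an exact hit on an observation, or a time outside the span of observations) are handled by assumption rather than taken from the library.

```python
from bisect import bisect_right


def value_at(samples, when, sem):
    """Model of the PM_MODE_INTERP heuristics for one metric.

    samples: list of (time, value) pairs sorted by time
    when:    the desired time
    sem:     "counter", "discrete" or "instant"
    Returns the value to report, or None if no value is available.
    """
    times = [t for t, _ in samples]
    i = bisect_right(times, when)      # samples[:i] were observed at or before `when`
    if i == 0:
        return None                    # nothing observed yet
    t0, v0 = samples[i - 1]            # closest earlier observation
    if sem == "discrete" or t0 == when:
        return v0                      # assumption: an exact hit is always usable
    if i == len(samples):
        return None                    # no later observation to bound `when`
    t1, v1 = samples[i]
    if sem == "instant":
        return v0                      # most recent observed value, not interpolated
    if sem == "counter":
        # linear interpolation between the two bounding observations
        return v0 + (v1 - v0) * (when - t0) / (t1 - t0)
    raise ValueError("unknown semantics: %r" % (sem,))


if __name__ == "__main__":
    accesses = [(0, 100), (300, 400)]  # hypothetical counter, 5 minute samples
    busy = [(0, 7), (30, 5), (60, 6)]  # hypothetical snapshot, 30 second samples
    print(value_at(accesses, 150, "counter"))
    print(value_at(busy, 45, "instant"))
    print(value_at(busy, 90, "instant"))
    print(value_at(busy, 90, "discrete"))
```

With this model a snapshot metric such as apache.busy_servers stays integral under interpolated replay, while a counter is spread linearly between its bounding samples.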
From owner-pcp@oss.sgi.com Tue Oct 31 12:18:41 2000 Received: by oss.sgi.com id ; Tue, 31 Oct 2000 12:18:32 -0800 Received: from ex1.ncsa.uiuc.edu ([141.142.2.9]:24249 "EHLO ex1.ncsa.uiuc.edu") by oss.sgi.com with ESMTP id ; Tue, 31 Oct 2000 12:18:13 -0800 Received: from mx1.ncsa.uiuc.edu (mx1.ncsa.uiuc.edu [141.142.2.8]) by ex1.ncsa.uiuc.edu (8.11.0/8.11.0) with ESMTP id e9VKIBJ19624 for ; Tue, 31 Oct 2000 14:18:12 -0600 (CST) Received: from osage.ncsa.uiuc.edu (osage.ncsa.uiuc.edu [141.142.2.56]) by mx1.ncsa.uiuc.edu (8.11.0/8.11.0) with ESMTP id e9VKIB318899 for ; Tue, 31 Oct 2000 14:18:11 -0600 (CST) Received: from localhost (abailey@localhost) by osage.ncsa.uiuc.edu (8.9.3/8.9.3) with ESMTP id OAA14211 for ; Tue, 31 Oct 2000 14:18:10 -0600 Date: Tue, 31 Oct 2000 14:18:10 -0600 (CST) From: Alan Bailey To: pcp@oss.sgi.com Subject: pmlogger control file In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-pcp@oss.sgi.com Precedence: bulk Return-Path: X-Orcpt: rfc822;pcp-outgoing Content-Length: 2970 Lines: 72 Did the format of the pmlogger control file get changed without the documentation getting changed? I'm setting up a logging machine to log a few metrics from a bunch of hosts. When I have the control file setup the way the documentation explains, it grabs one of the 'n's as the directory name, and creates it. It then uses 'n' for the directories for each host, which is invalid also. Should there just be one 'n' in the config file? It seems to work that way. I'm running pcp-2.1.10 on RH 6.2. 
Thanks for any help,
Alan

output below:
------------

[root@lanner abailey]# cat /var/pcp/config/pmlogger/control.cluster
ntsc1033 n n PCP_LOG_DIR/pmlogger/ntsc1033 -c config.cluster
ntsc1034 n n PCP_LOG_DIR/pmlogger/ntsc1034 -c config.cluster
ntsc1035 n n PCP_LOG_DIR/pmlogger/ntsc1035 -c config.cluster
[root@lanner abailey]# /usr/share/pcp/bin/pmlogger_check -c /var/pcp/config/pmlogger/control.cluster
pmlogger_check [/var/pcp/config/pmlogger/control.cluster:1]
Warning: creating directory (n) for PCP archive files
Restarting pmlogger for host "ntsc1033" ... timed out waiting!
    Usage: pmlogger [options] archive

    Options:
      -c configfile  file to load configuration from
      -h host        metrics source is PMCD on host
      -l logfile     redirect diagnostics and trace output
      -L             linger, even if not primary logger instance and nothing to log
      -n pmnsfile    use an alternative PMNS
      -P             execute as primary logger instance
      -r             report record sizes and archive growth rate
      -s endsize     terminate after endsize has been accumulated
      -t interval    default logging interval [default 60.0 seconds]
      -T endtime     terminate at given time
      -v volsize     switch log volumes after volsize has been accumulated
      -V version     generate version 1 or 2 archives (default is 2)
      -x fd          control file descriptor for application launching pmlogger
                     via pmRecordControl(3)
pmlogger_check: Error: cannot find pmlogger output file at "pmlogger.log"
Directory (/home/abailey/n) contents:
total 8
drwxrwxr-x 2 root root 4096 Oct 31 14:09 .
drwx------ 18 abailey abailey 4096 Oct 31 14:09 ..
---------- 1 root root 0 Oct 31 14:09 lock
pmlogger_check: Error: archive file 20001031.14.09.0 missing
Directory (/home/abailey/n) contents:
total 8
drwxrwxr-x 2 root root 4096 Oct 31 14:09 .
drwx------ 18 abailey abailey 4096 Oct 31 14:09 ..
---------- 1 root root 0 Oct 31 14:09 lock
pmlogger_check [/var/pcp/config/pmlogger/control.cluster:2]
Warning: creating directory (n) for PCP archive files
pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:2]
Error: Cannot start more than one pmlogger instance for archive directory "n"
... logging for host "ntsc1034" unchanged
pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:3]
Error: Cannot start more than one pmlogger instance for archive directory "n"
... logging for host "ntsc1035" unchanged
[root@lanner abailey]#

From owner-pcp@oss.sgi.com Tue Oct 31 12:39:02 2000
Date: Wed, 1 Nov 2000 07:36:03 +1100
From: Ken McDonell
Reply-To: kenmcd@sgi.com
To: Alan Bailey
cc: pcp@oss.sgi.com
Subject: Re: pmlogger control file

On Tue, 31 Oct 2000, Alan Bailey wrote:

>
> Did the format of the pmlogger control file get changed without the
> documentation getting changed?

Absolutely not, unless someone has a strong death wish!

> I'm setting up a logging machine to log a few metrics from a bunch of
> hosts.  When I have the control file setup the way the documentation
> explains, it grabs one of the 'n's as the directory name, and creates it.
> It then uses 'n' for the directories for each host, which is invalid also.
> Should there just be one 'n' in the config file?  It seems to work that
> way.

Could you please send me your /var/pcp/config/pmlogger/control.cluster
file as an attachment so I can check for whitespace strangeness?

> I'm running pcp-2.1.10 on RH 6.2.

That is a known good combination.

Some other things that would help ...

1. output from
        $ for f in /var/tmp/pmlogger/*; do echo === $f ===; cat $f; done
   before and after running /usr/share/pcp/bin/pmlogger_check

2. output from
        $ /usr/share/pcp/bin/pmlogger_check -NV

3. output from
        $ sh -x /usr/share/pcp/bin/pmlogger_check

> Thanks for any help,
> Alan
>
> output below:
> ------------
>
> [root@lanner abailey]# cat /var/pcp/config/pmlogger/control.cluster
> ntsc1033 n n PCP_LOG_DIR/pmlogger/ntsc1033 -c config.cluster
> ntsc1034 n n PCP_LOG_DIR/pmlogger/ntsc1034 -c config.cluster
> ntsc1035 n n PCP_LOG_DIR/pmlogger/ntsc1035 -c config.cluster
> [root@lanner abailey]# /usr/share/pcp/bin/pmlogger_check -c
> /var/pcp/config/pmlogger/control.cluster
> pmlogger_check [/var/pcp/config/pmlogger/control.cluster:1]
> Warning: creating directory (n) for PCP archive files

OK, this looks wrong, but your control file looks OK.

> Restarting pmlogger for host "ntsc1033" ... timed out waiting!

This looks like a second problem ...
> Usage: pmlogger [options] archive
>
> Options:
>   -c configfile  file to load configuration from
>   -h host        metrics source is PMCD on host
>   -l logfile     redirect diagnostics and trace output
>   -L             linger, even if not primary logger instance and
>                  nothing to log
>   -n pmnsfile    use an alternative PMNS
>   -P             execute as primary logger instance
>   -r             report record sizes and archive growth rate
>   -s endsize     terminate after endsize has been accumulated
>   -t interval    default logging interval [default 60.0 seconds]
>   -T endtime     terminate at given time
>   -v volsize     switch log volumes after volsize has been
>                  accumulated
>   -V version     generate version 1 or 2 archives (default is 2)
>   -x fd          control file descriptor for application launching
>                  pmlogger
>                  via pmRecordControl(3)
> pmlogger_check: Error: cannot find pmlogger output file at "pmlogger.log"
> Directory (/home/abailey/n) contents:
> total 8
> drwxrwxr-x 2 root root 4096 Oct 31 14:09 .
> drwx------ 18 abailey abailey 4096 Oct 31 14:09 ..
> ---------- 1 root root 0 Oct 31 14:09 lock
> pmlogger_check: Error: archive file 20001031.14.09.0 missing
> Directory (/home/abailey/n) contents:
> total 8
> drwxrwxr-x 2 root root 4096 Oct 31 14:09 .
> drwx------ 18 abailey abailey 4096 Oct 31 14:09 ..
> ---------- 1 root root 0 Oct 31 14:09 lock
> pmlogger_check [/var/pcp/config/pmlogger/control.cluster:2]
> Warning: creating directory (n) for PCP archive files
> pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:2]
> Error: Cannot start more than one pmlogger instance for archive directory
> "n"
> ... logging for host "ntsc1034" unchanged
> pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:3]
> Error: Cannot start more than one pmlogger instance for archive directory
> "n"
> ... logging for host "ntsc1035" unchanged
> [root@lanner abailey]#
>
>

From owner-pcp@oss.sgi.com Tue Oct 31 13:20:03 2000
Date: Tue, 31 Oct 2000 15:19:08 -0600 (CST)
From: Alan Bailey
To: kenmcd@sgi.com
cc: pcp@oss.sgi.com
Subject: Re: pmlogger control file

> Could you please send me your /var/pcp/config/pmlogger/control.cluster
> file as an attachment so I can check for whitespace strangeness?

I will, I'll send it to you and not the list.

>
> > I'm running pcp-2.1.10 on RH 6.2.
>
> That is a known good combination.
>
> Some other things that would help ...
>
> 1. output from
>
> $ for f in /var/tmp/pmlogger/*; do echo === $f ===; cat $f; done
>
> before and after running /usr/share/pcp/bin/pmlogger_check
>

before:
[root@lanner /root]# for f in /var/tmp/pmlogger/*; do echo === $f ===; cat $f; done
=== /var/tmp/pmlogger/* ===
cat: /var/tmp/pmlogger/*: No such file or directory
[root@lanner /root]#

(i started fresh)

after:
and afterwards there weren't any files either.

> 2. output from
>
> $ /usr/share/pcp/bin/pmlogger_check -NV

[root@lanner /root]# /usr/share/pcp/bin/pmlogger_check -NV -c /var/pcp/config/pmlogger/control.cluster
pmlogger_check [/var/pcp/config/pmlogger/control.cluster:1]
Warning: creating directory (n) for PCP archive files
+ cd /root/n
Restarting pmlogger for host "ntsc1033" ...
+ pmlogger -h ntsc1033 /var/log/pcp/pmlogger/ntsc1033 -c config.cluster 20001031.14.48
pmlogger_check [/var/pcp/config/pmlogger/control.cluster:2]
Warning: creating directory (n) for PCP archive files
pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:2]
Error: Cannot start more than one pmlogger instance for archive directory "n"
... logging for host "ntsc1034" unchanged
pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:3]
Error: Cannot start more than one pmlogger instance for archive directory "n"
... logging for host "ntsc1035" unchanged

>
> 3. output from
>
> $ sh -x /usr/share/pcp/bin/pmlogger_check

obviously quite long, hopefully it will help:

+ . /etc/pcp.env
++ __CONF=/etc/pcp.conf
++ [ !
-f /etc/pcp.conf ] +++ sed -e s/"/\"/g /etc/pcp.conf +++ awk -F= /^PCP_/ && NF == 2 { exports=exports" "$1 printf "%s=${%s:-\"%s\"}\n", $1, $1, $2 } END { print "export", exports } ++ eval PCP_RC_DIR=${PCP_RC_DIR:-"/etc/rc.d/init.d"} PCP_SYSCONFIG_DIR=${PCP_SYSCONFIG_DIR:-"/etc/sysconfig"} PCP_BIN_DIR=${PCP_BIN_DIR:-"/usr/bin"} PCP_BINADM_DIR=${PCP_BINADM_DIR:-"/usr/share/pcp/bin"} PCP_LIB_DIR=${PCP_LIB_DIR:-"/usr/lib"} PCP_SHARE_DIR=${PCP_SHARE_DIR:-"/usr/share/pcp"} PCP_INC_DIR=${PCP_INC_DIR:-"/usr/include/pcp"} PCP_MAN_DIR=${PCP_MAN_DIR:-"/usr/man"} PCP_VAR_DIR=${PCP_VAR_DIR:-"/var/pcp"} PCP_PMCDCONF_PATH=${PCP_PMCDCONF_PATH:-"/var/pcp/config/pmcd/pmcd.conf"} PCP_PMCDOPTIONS_PATH=${PCP_PMCDOPTIONS_PATH:-"/var/pcp/config/pmcd/pmcd.options"} PCP_PMDAS_DIR=${PCP_PMDAS_DIR:-"/var/pcp/pmdas"} PCP_LOG_DIR=${PCP_LOG_DIR:-"/var/log/pcp"} PCP_TMP_DIR=${PCP_TMP_DIR:-"/var/tmp"} PCP_DOC_DIR=${PCP_DOC_DIR:-"/usr/doc/pcp-2.1.10"} PCP_DEMOS_DIR=${PCP_DEMOS_DIR:-"/usr/share/pcp/demos"} PCP_MAGIC_FILE=${PCP_MAGIC_FILE:-"/usr/share/magic"} PCP_AWK_PROG=${PCP_AWK_PROG:-"gawk"} PCP_CPP_PROG=${PCP_CPP_PROG:-"/lib/cpp -P -traditional -undef"} PCP_PS_HAVE_BSD=${PCP_PS_HAVE_BSD:-"false"} PCP_PS_ALL_FLAGS=${PCP_PS_ALL_FLAGS:-"-efw"} PCP_PLATFORM=${PCP_PLATFORM:-"linux"} PCP_VERSION=${PCP_VERSION:-"2.1.10-8"} PCP_XCONFIRM_PROG=${PCP_XCONFIRM_PROG:-"/usr/share/pcp/lib/xconfirm"} export PCP_RC_DIR PCP_SYSCONFIG_DIR PCP_BIN_DIR PCP_BINADM_DIR PCP_LIB_DIR PCP_SHARE_DIR PCP_INC_DIR PCP_MAN_DIR PCP_VAR_DIR PCP_PMCDCONF_PATH PCP_PMCDOPTIONS_PATH PCP_PMDAS_DIR PCP_LOG_DIR PCP_TMP_DIR PCP_DOC_DIR PCP_DEMOS_DIR PCP_MAGIC_FILE PCP_AWK_PROG PCP_CPP_PROG PCP_PS_HAVE_BSD PCP_PS_ALL_FLAGS PCP_PLATFORM PCP_VERSION PCP_XCONFIRM_PROG +++ PCP_RC_DIR=/etc/rc.d/init.d +++ PCP_SYSCONFIG_DIR=/etc/sysconfig +++ PCP_BIN_DIR=/usr/bin +++ PCP_BINADM_DIR=/usr/share/pcp/bin +++ PCP_LIB_DIR=/usr/lib +++ PCP_SHARE_DIR=/usr/share/pcp +++ PCP_INC_DIR=/usr/include/pcp +++ PCP_MAN_DIR=/usr/man +++ PCP_VAR_DIR=/var/pcp 
+++ PCP_PMCDCONF_PATH=/var/pcp/config/pmcd/pmcd.conf +++ PCP_PMCDOPTIONS_PATH=/var/pcp/config/pmcd/pmcd.options +++ PCP_PMDAS_DIR=/var/pcp/pmdas +++ PCP_LOG_DIR=/var/log/pcp +++ PCP_TMP_DIR=/var/tmp +++ PCP_DOC_DIR=/usr/doc/pcp-2.1.10 +++ PCP_DEMOS_DIR=/usr/share/pcp/demos +++ PCP_MAGIC_FILE=/usr/share/magic +++ PCP_AWK_PROG=gawk +++ PCP_CPP_PROG=/lib/cpp -P -traditional -undef +++ PCP_PS_HAVE_BSD=false +++ PCP_PS_ALL_FLAGS=-efw +++ PCP_PLATFORM=linux +++ PCP_VERSION=2.1.10-8 +++ PCP_XCONFIRM_PROG=/usr/share/pcp/lib/xconfirm +++ export PCP_RC_DIR PCP_SYSCONFIG_DIR PCP_BIN_DIR PCP_BINADM_DIR PCP_LIB_DIR PCP_SHARE_DIR PCP_INC_DIR PCP_MAN_DIR PCP_VAR_DIR PCP_PMCDCONF_PATH PCP_PMCDOPTIONS_PATH PCP_PMDAS_DIR PCP_LOG_DIR PCP_TMP_DIR PCP_DOC_DIR PCP_DEMOS_DIR PCP_MAGIC_FILE PCP_AWK_PROG PCP_CPP_PROG PCP_PS_HAVE_BSD PCP_PS_ALL_FLAGS PCP_PLATFORM PCP_VERSION PCP_XCONFIRM_PROG ++ PATH=/usr/sbin:/sbin:/bin:/usr/bin:/usr/bsd:/etc:/usr/etc:/usr/bin:/usr/share/pcp/bin:/usr/share/pcp/bin:/usr/bin/X11 ++ export PATH + . /usr/share/pcp/lib/rc-proc.sh ++ . /etc/pcp.env +++ __CONF=/etc/pcp.conf +++ [ ! 
-f /etc/pcp.conf ] ++++ sed -e s/"/\"/g /etc/pcp.conf ++++ awk -F= /^PCP_/ && NF == 2 { exports=exports" "$1 printf "%s=${%s:-\"%s\"}\n", $1, $1, $2 } END { print "export", exports } +++ eval PCP_RC_DIR=${PCP_RC_DIR:-"/etc/rc.d/init.d"} PCP_SYSCONFIG_DIR=${PCP_SYSCONFIG_DIR:-"/etc/sysconfig"} PCP_BIN_DIR=${PCP_BIN_DIR:-"/usr/bin"} PCP_BINADM_DIR=${PCP_BINADM_DIR:-"/usr/share/pcp/bin"} PCP_LIB_DIR=${PCP_LIB_DIR:-"/usr/lib"} PCP_SHARE_DIR=${PCP_SHARE_DIR:-"/usr/share/pcp"} PCP_INC_DIR=${PCP_INC_DIR:-"/usr/include/pcp"} PCP_MAN_DIR=${PCP_MAN_DIR:-"/usr/man"} PCP_VAR_DIR=${PCP_VAR_DIR:-"/var/pcp"} PCP_PMCDCONF_PATH=${PCP_PMCDCONF_PATH:-"/var/pcp/config/pmcd/pmcd.conf"} PCP_PMCDOPTIONS_PATH=${PCP_PMCDOPTIONS_PATH:-"/var/pcp/config/pmcd/pmcd.options"} PCP_PMDAS_DIR=${PCP_PMDAS_DIR:-"/var/pcp/pmdas"} PCP_LOG_DIR=${PCP_LOG_DIR:-"/var/log/pcp"} PCP_TMP_DIR=${PCP_TMP_DIR:-"/var/tmp"} PCP_DOC_DIR=${PCP_DOC_DIR:-"/usr/doc/pcp-2.1.10"} PCP_DEMOS_DIR=${PCP_DEMOS_DIR:-"/usr/share/pcp/demos"} PCP_MAGIC_FILE=${PCP_MAGIC_FILE:-"/usr/share/magic"} PCP_AWK_PROG=${PCP_AWK_PROG:-"gawk"} PCP_CPP_PROG=${PCP_CPP_PROG:-"/lib/cpp -P -traditional -undef"} PCP_PS_HAVE_BSD=${PCP_PS_HAVE_BSD:-"false"} PCP_PS_ALL_FLAGS=${PCP_PS_ALL_FLAGS:-"-efw"} PCP_PLATFORM=${PCP_PLATFORM:-"linux"} PCP_VERSION=${PCP_VERSION:-"2.1.10-8"} PCP_XCONFIRM_PROG=${PCP_XCONFIRM_PROG:-"/usr/share/pcp/lib/xconfirm"} export PCP_RC_DIR PCP_SYSCONFIG_DIR PCP_BIN_DIR PCP_BINADM_DIR PCP_LIB_DIR PCP_SHARE_DIR PCP_INC_DIR PCP_MAN_DIR PCP_VAR_DIR PCP_PMCDCONF_PATH PCP_PMCDOPTIONS_PATH PCP_PMDAS_DIR PCP_LOG_DIR PCP_TMP_DIR PCP_DOC_DIR PCP_DEMOS_DIR PCP_MAGIC_FILE PCP_AWK_PROG PCP_CPP_PROG PCP_PS_HAVE_BSD PCP_PS_ALL_FLAGS PCP_PLATFORM PCP_VERSION PCP_XCONFIRM_PROG ++++ PCP_RC_DIR=/etc/rc.d/init.d ++++ PCP_SYSCONFIG_DIR=/etc/sysconfig ++++ PCP_BIN_DIR=/usr/bin ++++ PCP_BINADM_DIR=/usr/share/pcp/bin ++++ PCP_LIB_DIR=/usr/lib ++++ PCP_SHARE_DIR=/usr/share/pcp ++++ PCP_INC_DIR=/usr/include/pcp ++++ PCP_MAN_DIR=/usr/man ++++ 
PCP_VAR_DIR=/var/pcp ++++ PCP_PMCDCONF_PATH=/var/pcp/config/pmcd/pmcd.conf ++++ PCP_PMCDOPTIONS_PATH=/var/pcp/config/pmcd/pmcd.options ++++ PCP_PMDAS_DIR=/var/pcp/pmdas ++++ PCP_LOG_DIR=/var/log/pcp ++++ PCP_TMP_DIR=/var/tmp ++++ PCP_DOC_DIR=/usr/doc/pcp-2.1.10 ++++ PCP_DEMOS_DIR=/usr/share/pcp/demos ++++ PCP_MAGIC_FILE=/usr/share/magic ++++ PCP_AWK_PROG=gawk ++++ PCP_CPP_PROG=/lib/cpp -P -traditional -undef ++++ PCP_PS_HAVE_BSD=false ++++ PCP_PS_ALL_FLAGS=-efw ++++ PCP_PLATFORM=linux ++++ PCP_VERSION=2.1.10-8 ++++ PCP_XCONFIRM_PROG=/usr/share/pcp/lib/xconfirm ++++ export PCP_RC_DIR PCP_SYSCONFIG_DIR PCP_BIN_DIR PCP_BINADM_DIR PCP_LIB_DIR PCP_SHARE_DIR PCP_INC_DIR PCP_MAN_DIR PCP_VAR_DIR PCP_PMCDCONF_PATH PCP_PMCDOPTIONS_PATH PCP_PMDAS_DIR PCP_LOG_DIR PCP_TMP_DIR PCP_DOC_DIR PCP_DEMOS_DIR PCP_MAGIC_FILE PCP_AWK_PROG PCP_CPP_PROG PCP_PS_HAVE_BSD PCP_PS_ALL_FLAGS PCP_PLATFORM PCP_VERSION PCP_XCONFIRM_PROG +++ PATH=/usr/sbin:/sbin:/bin:/usr/bin:/usr/bsd:/etc:/usr/etc:/usr/bin:/usr/share/pcp/bin:/usr/share/pcp/bin:/usr/bin/X11 +++ export PATH + unset PCP_STDERR + tmp=/tmp/17751 + status=0 + echo + trap rm -f `[ -f /tmp/17751.lock ] && cat /tmp/17751.lock` /tmp/17751.*; exit $status 0 1 2 3 15 ++ basename /usr/share/pcp/bin/pmlogger_check + prog=pmlogger_check + CONTROL=/var/pcp/config/pmlogger/control ++ hostname + LOCALHOSTNAME=lanner.ncsa.uiuc.edu + [ -z lanner.ncsa.uiuc.edu ] ++ which pwd ++ gawk BEGIN { i = 0 } / not in / { i = 1 } / aliased to / { i = 1 } { if ( i == 0 ) print } + PWDCMND=/bin/pwd + [ X = X ] + PWDCMND=/bin/pwd + logfile=pmlogger.log + SHOWME=false + MV=mv + VERBOSE=false + VERY_VERBOSE=false + usage=Usage: pmlogger_check [-NV] [-c control] + getopts c:NV? c + CONTROL=/var/pcp/config/pmlogger/control.cluster + getopts c:NV? c ++ expr 3 - 1 + shift 2 + [ 0 -ne 0 ] + [ ! 
-f /var/pcp/config/pmlogger/control.cluster ] + version=1.0 + echo + rm -f /tmp/17751.err + line=0 + cat /var/pcp/config/pmlogger/control.cluster + sed -e s/LOCALHOSTNAME/lanner.ncsa.uiuc.edu/g -e s;PCP_LOG_DIR;/var/log/pcp;g + read host primary socks dir args ++ expr 0 + 1 + line=1 + [ 1.0 = 1.0 ] + args=/var/log/pcp/pmlogger/ntsc1033 -c config.cluster + dir=n + socks=n + [ -z n -o -z n -o -z n -o -z /var/log/pcp/pmlogger/ntsc1033 -c config.cluster ] + false + [ ! -d n ] + mkdir -p n + [ ! -d n ] + _warning creating directory (n) for PCP archive files + echo pmlogger_check [/var/pcp/config/pmlogger/control.cluster:1] pmlogger_check [/var/pcp/config/pmlogger/control.cluster:1] + echo Warning: creating directory (n) for PCP archive files Warning: creating directory (n) for PCP archive files + [ ! -d n ] ++ grep n /tmp/17751.dir + [ = n ] + echo n + cd n ++ /bin/pwd + dir=/root/n + false + [ ! -w /root/n ] + fail=true + rm -f /tmp/17751.stamp + pmlock -v lock + echo /root/n/lock + fail=false + break + false + pid= + [ Xn = Xy ] ++ pmhostname ntsc1033 + fqdn=ntsc1033.ncsa.uiuc.edu + [ /var/tmp/pmlogger/[0-9]* = /var/tmp/pmlogger/[0-9]* ] + continue + [ -z ] + rm -f Latest + [ Xn = Xy ] + args=-h ntsc1033 /var/log/pcp/pmlogger/ntsc1033 -c config.cluster + iam= ++ date +%Y%m%d.%H.%M + LOGNAME=20001031.15.14 + suff= + [ 20001031.15.14.* = 20001031.15.14.* ] + continue + false + sock_me= + [ n = y ] + _get_logfile + want=false + false + false + false + false + false + [ -f pmlogger.log ] + false + pmlogger -h ntsc1033 /var/log/pcp/pmlogger/ntsc1033 -c config.cluster 20001031.15.14 + pid=17795 + _check_logger 17795 + false + delay=5 + [ ! -z ] + x=5 + [ ! 
-z ] ++ expr 5 + 20 * 5 + delay=105 + i=0 + [ 0 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 0 + 5 + i=5 + [ 5 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 5 + 5 + i=10 + [ 10 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 10 + 5 + i=15 + [ 15 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 15 + 5 + i=20 + [ 20 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 20 + 5 + i=25 + [ 25 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 25 + 5 + i=30 + [ 30 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 30 + 5 + i=35 + [ 35 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 35 + 5 + i=40 + [ 40 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 40 + 5 + i=45 + [ 45 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 45 + 5 + i=50 + [ 50 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 50 + 5 + i=55 + [ 55 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 55 + 5 + i=60 + [ 60 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 60 + 5 + i=65 + [ 65 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 65 + 5 + i=70 + [ 70 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 70 + 5 + i=75 + [ 75 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 75 + 5 + i=80 + [ 80 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 80 + 5 + i=85 + [ 85 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 85 + 5 + i=90 + [ 90 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 90 + 5 + i=95 + [ 95 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 95 + 5 + i=100 + [ 100 -lt 105 ] + false + [ -f pmlogger.log ] + sleep 5 ++ expr 100 + 5 + i=105 + [ 105 -lt 105 ] + false + _message restart + echo -n Restarting pmlogger for host "ntsc1033" ... Restarting pmlogger for host "ntsc1033" ...+ echo timed out waiting! timed out waiting! 
+ sed -e s/^/ / /tmp/17751.out Usage: pmlogger [options] archive Options: -c configfile file to load configuration from -h host metrics source is PMCD on host -l logfile redirect diagnostics and trace output -L linger, even if not primary logger instance and nothing to log -n pmnsfile use an alternative PMNS -P execute as primary logger instance -r report record sizes and archive growth rate -s endsize terminate after endsize has been accumulated -t interval default logging interval [default 60.0 seconds] -T endtime terminate at given time -v volsize switch log volumes after volsize has been accumulated -V version generate version 1 or 2 archives (default is 2) -x fd control file descriptor for application launching pmlogger via pmRecordControl(3) + _check_logfile + [ ! -f pmlogger.log ] + echo pmlogger_check: Error: cannot find pmlogger output file at "pmlogger.log" pmlogger_check: Error: cannot find pmlogger output file at "pmlogger.log" ++ dirname pmlogger.log + logdir=. ++ cd . ++ /bin/pwd + echo Directory (/root/n) contents: Directory (/root/n) contents: + ls -la . total 8 drwxrwxr-x 2 root root 4096 Oct 31 15:14 . drwxr-x--- 6 root root 4096 Oct 31 15:14 .. ---------- 1 root root 0 Oct 31 15:14 lock + return 1 + [ -f 20001031.15.14.0 ] + echo pmlogger_check: Error: archive file 20001031.15.14.0 missing pmlogger_check: Error: archive file 20001031.15.14.0 missing ++ dirname 20001031.15.14 + logdir=. ++ cd . ++ /bin/pwd + echo Directory (/root/n) contents: Directory (/root/n) contents: + ls -la . total 8 drwxrwxr-x 2 root root 4096 Oct 31 15:14 . drwxr-x--- 6 root root 4096 Oct 31 15:14 .. ---------- 1 root root 0 Oct 31 15:14 lock + _unlock + rm -f lock + echo + read host primary socks dir args ++ expr 1 + 1 + line=2 + [ 1.0 = 1.0 ] + args=/var/log/pcp/pmlogger/ntsc1034 -c config.cluster + dir=n + socks=n + [ -z n -o -z n -o -z n -o -z /var/log/pcp/pmlogger/ntsc1034 -c config.cluster ] + false + [ ! -d n ] + mkdir -p n + [ ! 
-d n ] + _warning creating directory (n) for PCP archive files + echo pmlogger_check [/var/pcp/config/pmlogger/control.cluster:2] pmlogger_check [/var/pcp/config/pmlogger/control.cluster:2] + echo Warning: creating directory (n) for PCP archive files Warning: creating directory (n) for PCP archive files + [ ! -d n ] ++ grep n /tmp/17751.dir + [ n = n ] + _error Cannot start more than one pmlogger instance for archive directory "n" + echo pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:2] pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:2] + echo Error: Cannot start more than one pmlogger instance for archive directory "n" Error: Cannot start more than one pmlogger instance for archive directory "n" + echo ... logging for host "ntsc1034" unchanged ... logging for host "ntsc1034" unchanged + touch /tmp/17751.err + continue + read host primary socks dir args ++ expr 2 + 1 + line=3 + [ 1.0 = 1.0 ] + args=/var/log/pcp/pmlogger/ntsc1035 -c config.cluster + dir=n + socks=n + [ -z n -o -z n -o -z n -o -z /var/log/pcp/pmlogger/ntsc1035 -c config.cluster ] + false + [ ! -d n ] + [ ! -d n ] ++ grep n /tmp/17751.dir + [ n = n ] + _error Cannot start more than one pmlogger instance for archive directory "n" + echo pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:3] pmlogger_check: [/var/pcp/config/pmlogger/control.cluster:3] + echo Error: Cannot start more than one pmlogger instance for archive directory "n" Error: Cannot start more than one pmlogger instance for archive directory "n" + echo ... logging for host "ntsc1035" unchanged ... 
logging for host "ntsc1035" unchanged
+ touch /tmp/17751.err
+ continue
+ read host primary socks dir args
+ [ -f /tmp/17751.err ]
+ status=1
+ exit
++ [ -f /tmp/17751.lock ]
++ cat /tmp/17751.lock
+ rm -f /tmp/17751.dir /tmp/17751.err /tmp/17751.lock /tmp/17751.out
+ exit 1

From owner-pcp@oss.sgi.com Tue Oct 31 13:50:33 2000
Date: Wed, 1 Nov 2000 08:48:48 +1100
From: Ken McDonell
Reply-To: kenmcd@sgi.com
To: Alan Bailey
cc: pcp@oss.sgi.com, Mike Gigante
Subject: Re: pmlogger control file

On Tue, 31 Oct 2000, Alan Bailey wrote:

> > Could you please send me your /var/pcp/config/pmlogger/control.cluster
> > file as an attachment so I can check for whitespace strangeness?

Whitespace is not the problem, but the version is.

You need this line in the start of control.cluster

    $version=1.1

The man page for pmlogger_check(1) fails to mention this, but the
sample we ship in /var/pcp/config/pmlogger/control includes
this and the preceding comment ...

    # DO NOT REMOVE OR EDIT THE FOLLOWING LINE

Sorry.
The reasons for this are not quite lost in the mists of antiquity, but
relate to an earlier version of the control file (long before any open
source release) that did not include a version number and did not
include the "socks" field ... for backwards compatibility, the simplest
solution was
    - expect a $version line, else no $version line means version 1.0

I'll fix the man page for the next release.

ps. Mike Gigante, could you please check this in the PCP_UAG book.

From owner-pcp@oss.sgi.com Tue Oct 31 13:57:03 2000
Date: Tue, 31 Oct 2000 15:56:29 -0600 (CST)
From: Alan Bailey
To: kenmcd@sgi.com
cc: pcp@oss.sgi.com
Subject: Re: pmlogger control file

Ah!  Thanks for pointing that out.  I should have looked closer at the
sample file, I read the man pages too much ;)

Alan

On Wed, 1 Nov 2000, Ken McDonell wrote:

> On Tue, 31 Oct 2000, Alan Bailey wrote:
>
> > > Could you please send me your /var/pcp/config/pmlogger/control.cluster
> > > file as an attachment so I can check for whitespace strangeness?
>
> Whitespace is not the problem, but the version is.
>
> You need this line in the start of control.cluster
>
>     $version=1.1
>
> The man page for pmlogger_check(1) fails to mention this, but the
> sample we ship in /var/pcp/config/pmlogger/control includes
> this and the preceding comment ...
>
>     # DO NOT REMOVE OR EDIT THE FOLLOWING LINE
>
> Sorry.
>
> The reasons for this are not quite lost in the mists of antiquity, but
> relate to an earlier version of the control file (long before any open
> source release) that did not include a version number and did not
> include the "socks" field ... for backwards compatibility, the simplest
> solution was
>     - expect a $version line, else no $version line means version 1.0
>
> I'll fix the man page for the next release.
>
> ps. Mike Gigante, could you please check this in the PCP_UAG book.
>

--
Alan Bailey
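
The field shift that caused the stray "n" directory in this thread can be
sketched as follows. This is a reconstruction from the sh -x trace quoted
earlier (args gains the old dir, dir takes the value read as socks), not
the actual pmlogger_check source; parse_control is a hypothetical helper
name, and comment and blank-line handling are assumptions added for the
example.

```shell
#!/bin/sh
# Sketch reconstructed from the sh -x trace in this thread; it is NOT the
# real pmlogger_check source. parse_control is a hypothetical helper.
# Reads control-file lines on stdin and prints "host dir" for each entry.
parse_control() {
    version=1.0                      # no $version line means version 1.0
    while read -r host primary socks dir args
    do
        case "$host" in
        '#'*|'')                     # skip comments and blank lines
            continue ;;
        '$version='*)                # e.g. $version=1.1
            version=${host#'$version='}
            continue ;;
        esac
        if [ "$version" = "1.0" ]
        then
            # version 1.0 lines have no "socks" field, so every field
            # after "primary" was read one position too far to the right
            args="$dir $args"
            dir=$socks
            socks=n
        fi
        echo "$host $dir"
    done
}
```

Fed the line "ntsc1033 n n /var/log/pcp/pmlogger/ntsc1033 -c config.cluster"
on its own, this prints "ntsc1033 n", matching the directory named "n" in
the trace; with a leading "$version=1.1" line it prints
"ntsc1033 /var/log/pcp/pmlogger/ntsc1033" instead.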