From markgw@sgi.com Tue Apr  1 21:26:58 2003
From: Mark Goodwin <markgw@sgi.com>
Date: Wed, 2 Apr 2003 15:25:36 +1000 (EST)
To: pcp@oss.sgi.com
Subject: Re: [announce] PCP 2.3.0-14 available

The pcp-2.3.0-14 release has now been moved to the stable download
directory

    ftp://oss.sgi.com/projects/pcp/download

and the web pages have been updated. In particular, the CHANGELOG since
the previous stable release (2.2.2) is on-line at

    http://oss.sgi.com/projects/pcp/latest.html

and there is some news at

    http://oss.sgi.com/projects/pcp/news.html

We'll be moving ahead with a new dev release soon.

Thanks
-- Mark

On Mon, 3 Mar 2003, Mark Goodwin wrote:
> SGI is pleased to announce that the next pre-release version 2.3.0-14 of
> Performance Co-Pilot (PCP) open source is now available for download from:
>
>     ftp://oss.sgi.com/projects/pcp/download/dev
>
> This version 2.3.0-14 will become the current stable version in a few
> days' time - it has been extensively tested and passes QA.
> Thanks to those who contributed, in particular Ken McDonell, Troy Dawson
> and Mike Mason, who all worked to nail the ksyms bug that was seg
> faulting on some platforms but not others, and to Zhang Sonic for his QA
> work. This version should also build and run on Solaris, thanks to
> contributions from Alan Hoyt. Alan is also working on a Solaris platform
> PMDA.
>
> There are re-built RPMs for i386 and ia64 platforms in the above ftp
> directory. Other platforms will need to build RPMs from either the SRPM
> or from the tarball, e.g.:
>
>     # tar xvzf pcp-2.3.0.src.tar.gz
>     # cd pcp-2.3.0
>     # ./Makepkgs
>
> Please report any problems to the list.
>
> Thanks
> -- Mark Goodwin
> SGI Engineering.

From kenmcd@melbourne.sgi.com Sat Apr 12 16:12:43 2003
From: kenmcd@melbourne.sgi.com
Date: Sun, 13 Apr 2003 09:11:05 +1000 (EST)
To: "Davis, Todd C"
Cc: pcp@oss.sgi.com
Subject: Re: disk.dev.avactive and disk.dev.aveq on RH Advanced Server

Apart from the scaling problem in the Linux PMDA that Todd reported later
with disk.dev.avactive and disk.dev.aveq, I am unaware of any problems
with these metrics. Since the time metrics are in units of msec, these
numbers ...

> Average_disk_utilization (Tue Mar 18 11:09:09 2003):
>     localhost: [sda] 10.0
> Average_queue_length (Tue Mar 18 11:09:09 2003):
>     localhost: [sda] 30.1

are time utilizations, so it looks like the numbers here are from before
the fix to remove the incorrect * 1000 / HZ scaling in the PMDA (the
active number reported here should be in the range 0.0 to 1.0).

It looks to me as though your sard patch may well have some sort of race
condition where it "loses" the completion processing for one or more (I'd
guess 3, given the aveq value) I/Os. We have seen no evidence of this on
our local 2.4.17, 2.4.18 and 2.4.20 kernels ... but we are not running
RH AS.

On Tue, 18 Mar 2003, Davis, Todd C wrote:
> With no disk activity on the system I see disk.dev.avactive and
> disk.dev.aveq on the root drive. The last sample had some disk io, but
> the disk.dev.avactive number did not change and the disk.dev.aveq
> number did not change significantly. Are these metrics supposed to be
> accurate? They look bogus to me.
>
> I am running RedHat Advanced Server with a 2.4.18 kernel with the sard
> patch applied.
>
> The script:
>
> pmie -f -e -V <<pmie.end
> //
> // Watch average disk utilization and average queue length
> //
> myhost = "localhost";  // the host of interest
> delta = 3 sec;
> Block_total =
>     disk.dev.blktotal :\$myhost;
> Average_disk_utilization =
>     disk.dev.avactive :\$myhost;
> Average_queue_length =
>     disk.dev.aveq :\$myhost;
> pmie.end
>
> The output:
>
> Block_total (Tue Mar 18 11:09:06 2003): ? ?
> Average_disk_utilization (Tue Mar 18 11:09:06 2003): ? ?
> Average_queue_length (Tue Mar 18 11:09:06 2003): ? ?
> Block_total (Tue Mar 18 11:09:09 2003):
>     localhost: [sda] 0
>     localhost: [sdb] 0
> Average_disk_utilization (Tue Mar 18 11:09:09 2003):
>     localhost: [sda] 10.0
>     localhost: [sdb] 0
> Average_queue_length (Tue Mar 18 11:09:09 2003):
>     localhost: [sda] 30.1
>     localhost: [sdb] 0
>
> Block_total (Tue Mar 18 11:09:12 2003):
>     localhost: [sda] 0
>     localhost: [sdb] 0
> Average_disk_utilization (Tue Mar 18 11:09:12 2003):
>     localhost: [sda] 10.0
>     localhost: [sdb] 0
> Average_queue_length (Tue Mar 18 11:09:12 2003):
>     localhost: [sda] 30.0
>     localhost: [sdb] 0
>
> Block_total (Tue Mar 18 11:09:15 2003):
>     localhost: [sda] 0
>     localhost: [sdb] 0
> Average_disk_utilization (Tue Mar 18 11:09:15 2003):
>     localhost: [sda] 10.0
>     localhost: [sdb] 0
> Average_queue_length (Tue Mar 18 11:09:15 2003):
>     localhost: [sda] 30.0
>     localhost: [sdb] 0
>
> Block_total (Tue Mar 18 11:09:18 2003):
>     localhost: [sda] 0
>     localhost: [sdb] 0
> Average_disk_utilization (Tue Mar 18 11:09:18 2003):
>     localhost: [sda] 10.0
>     localhost: [sdb] 0
> Average_queue_length (Tue Mar 18 11:09:18 2003):
>     localhost: [sda] 30.0
>     localhost: [sdb] 0
>
> Block_total (Tue Mar 18 11:09:21 2003):
>     localhost: [sda] 85
>     localhost: [sdb] 0
> Average_disk_utilization (Tue Mar 18 11:09:21 2003):
>     localhost: [sda] 10.0
>     localhost: [sdb] 0
> Average_queue_length (Tue Mar 18 11:09:21 2003):
>     localhost: [sda] 30.9
>     localhost: [sdb] 0
>
> Todd C. Davis
> These are my opinions and absolutely not official opinions of Intel Corp.
> Telco Systems Development
> Intel Corporation, Columbia Design Center
> CBA-2, Suite 100
> 250 Berry Hill Road
> Columbia, SC 29210
> (803) 461-6108
> fax: (803) 461-6292
> mailto:todd.c.davis@intel.com

From kenmcd@melbourne.sgi.com Sat Apr 12 16:12:56 2003
From: kenmcd@melbourne.sgi.com
Date: Sun, 13 Apr 2003 09:11:18 +1000 (EST)
To: "Davis, Todd C"
Cc: pcp@oss.sgi.com
Subject: RE: disk.dev.avactive and disk.dev.aveq on RH Advanced Server

On Wed, 19 Mar 2003, Davis, Todd C wrote:
> I think I found part of the bug for these metrics. The metrics are being
> converted to milliseconds twice, in the sard patch and in the Linux pmda:

Well spotted. The reason we had not found this earlier is that most of
our Linux testing has been done on machines where HZ is 1024, so
* 1000 / 1024 introduces only a very small error.
Of course if HZ is 100, this error makes the numbers too big by a factor
of 10.

> From the sard patch:
>
> +#define MSEC(x) ((x) * 1000 / HZ)
> +        MSEC(hd->rd_ticks),
> +        hd->wr_ios, hd->wr_merges,
> +        hd->wr_sectors,
> +        MSEC(hd->wr_ticks),
> +        hd->ios_in_flight,
> +        MSEC(hd->io_ticks),
> +        MSEC(hd->aveq));
> +#undef MSEC
>
> From the Linux pmda:
>
>     case 46: /* disk.dev.avactive */
>         atom->ul = 1000 * p->io_ticks / proc_stat.hz;
>         break;
>     case 47: /* disk.dev.aveq */
>         atom->ul = 1000 * p->aveq / proc_stat.hz;
>         break;
>
>     case 44: /* disk.all.avactive */
>         atom->ull += 1000 * p->io_ticks / proc_stat.hz;
>         break;
>     case 45: /* disk.all.aveq */
>         atom->ull += 1000 * p->aveq / proc_stat.hz;
>         break;
>
> Also, shouldn't all the metrics be ull?

Not necessarily. The raw data from /proc/partitions is a 32-bit number,
hence the _per disk_ metrics are ul. Once we start adding these together
to produce the _all_ metrics, we need ull to avoid overflows (within SGI
we often have multiple hundreds of disks on one Linux system, so this is
a real issue for us).
From dawson@fnal.gov Wed Apr 23 08:02:41 2003
From: Troy Dawson <dawson@fnal.gov>
Date: Wed, 23 Apr 2003 10:02:40 -0500
To: pcp@oss.sgi.com
Subject: PCP on RedHat 9

Howdy,
I'm having a hard time recompiling pcp 2.3 on RedHat 9. It's because of
that blasted 'errno' problem that a lot of other programs are having on
RedHat 9. (Boy, when they said they broke binary compatibility, they
weren't kidding.)
Am I the only person who's seen this, or am I just the first to report it?

I've tried just putting the #include <errno.h> in various files, but I
never seemed to find them all, and then eventually I broke something.

Has anyone been able to get this working yet, or should I keep on
plugging away?
Troy
--
__________________________________________________
Troy Dawson    dawson@fnal.gov    (630)840-6468
Fermilab  ComputingDivision/OSS  CSI Group
__________________________________________________

From dawson@fnal.gov Wed Apr 23 08:18:53 2003
From: Troy Dawson <dawson@fnal.gov>
Date: Wed, 23 Apr 2003 10:18:52 -0500
To: pcp@oss.sgi.com
Subject: pcpmon and pcp-pro

Howdy again,
My old version of pcp-pro didn't work very well on pcp 2.3. (I'm not
complaining, just stating.) So I tried out the latest pcpmon, which
worked very well.
The only problem is that all the web pages I found pointing at pcpmon's
web site didn't work.
Is pcpmon still being developed? If so, is there a web page? If not, is
anyone planning on taking it up again?

On a related subject, could all of you who work at SGI get after your
marketing people again and ask them to let normal companies buy pcp-pro?
They have a product that people want, but they won't sell it to them.
This, I'm sure, is having an effect on how well pcp gets used in the
worldwide community. I've talked to our SGI hardware guy here; he said
he'd talk to some people. But if y'all could mention it again it would
be helpful.
Thanks
Troy

p.s. Yes, I got my original version of pcp-pro legally, but when we
tried to buy some more copies, or upgrade, they hummed and hawed and
finally said no.
--
__________________________________________________
Troy Dawson    dawson@fnal.gov    (630)840-6468
Fermilab  ComputingDivision/OSS  CSI Group
__________________________________________________

From tichi404@yahoo.com Wed Apr 23 09:42:22 2003
From: ti chi <tichi404@yahoo.com>
Date: Wed, 23 Apr 2003 09:42:20 -0700 (PDT)
To: Troy Dawson <dawson@fnal.gov>, pcp@oss.sgi.com
Subject: Re: pcpmon and pcp-pro

--- Troy Dawson <dawson@fnal.gov> wrote:
> Howdy again,
> My old version of pcp-pro didn't work very well on pcp 2.3. (I'm not
> complaining, just stating.) So I tried out the latest pcpmon, which
> worked very well.
> The only problem is that all the web pages I found pointing at pcpmon's
> web site didn't work.
> Is pcpmon still being developed? If so, is there a web page? If not, is
> anyone planning on taking it up again?
I don't know, but I think someone else on the list said they were
working on another GUI; I have not seen anything yet.

> On a related subject, could you all that work at SGI get after your
> marketing people again and ask them to let normal companies buy
> pcp-pro. They have a product that people want, but they won't sell it
> to them.

Hold your breath and you'll go blue! I am not hopeful - SGI is not being
helpful, and the pcp community is not so big.

--Ti

From nathans@wobbly.melbourne.sgi.com Mon Apr 28 00:10:33 2003
From: Nathan Scott <nathans@sgi.com>
Date: Mon, 28 Apr 2003 17:07:05 +1000
To: Troy Dawson <dawson@fnal.gov>, markgw@sgi.com
Cc: pcp@oss.sgi.com
Subject: Re: PCP on RedHat 9

On Wed, Apr 23, 2003 at 10:02:40AM -0500, Troy Dawson wrote:
> Howdy,
> I'm having a hard time recompiling pcp 2.3 on RedHat 9. It's because of
> that blasted 'errno' problem that a lot of other programs are having on
> RedHat 9. (Boy, when they said they broke binary compatibility, they
> weren't kidding.)
> Am I the only person who's seen this, or am I just the first to report
> it?
>
> I've tried just putting the #include <errno.h> in various files, but I
> never seemed to find them all, and then eventually I broke something.
>
> Has anyone been able to get this working yet, or should I keep on
> plugging away?

I have this working, and will check in the changes shortly (and Mark is
back in the office now, so I imagine new source should also appear on
oss.sgi.com soon). I also have a bison version fix up my sleeve for pmie
(for the current Debian bison), and some updates to the XFS metrics in
the pipeline as well.

cheers.

--
Nathan