From janfrode@tanso.net Mon Jun 4 20:13:20 2007 Received: with ECARTIS (v1.0.0; list pcp); Mon, 04 Jun 2007 23:04:19 -0700 (PDT) Received: from wx-out-0506.google.com (wx-out-0506.google.com [66.249.82.227]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l553DIWt024134 for ; Mon, 4 Jun 2007 20:13:19 -0700 Received: by wx-out-0506.google.com with SMTP id s17so1498133wxc for ; Mon, 04 Jun 2007 20:13:18 -0700 (PDT) Received: by 10.70.11.5 with SMTP id 5mr8324810wxk.1181011530998; Mon, 04 Jun 2007 19:45:30 -0700 (PDT) Received: from lc4eb6380248654.ibm.com ( [85.19.196.182]) by mx.google.com with ESMTP id m6sm1397950wrm.2007.06.04.19.45.29; Mon, 04 Jun 2007 19:45:30 -0700 (PDT) Received: from lc4eb6380248654.ibm.com (localhost.localdomain [127.0.0.1]) by lc4eb6380248654.ibm.com (8.13.8/8.13.8) with ESMTP id l52KendF004089; Sat, 2 Jun 2007 22:40:49 +0200 Received: (from janfrode@localhost) by lc4eb6380248654.ibm.com (8.13.8/8.13.8/Submit) id l52KemJZ004086; Sat, 2 Jun 2007 22:40:48 +0200 X-Authentication-Warning: lc4eb6380248654.ibm.com: janfrode set sender to janfrode@tanso.net using -f Date: Sat, 2 Jun 2007 22:40:48 +0200 From: Jan-Frode Myklebust To: Nathan Scott Cc: pcp@oss.sgi.com Subject: Re: pmie spawning more than 1 instance per host Message-ID: <20070602204048.GA4067@lc4eb6380248654.ibm.com> References: <1180484426.6273.748.camel@edge> <20070530082218.GA6332@lc4eb6380248654.ibm.com> <1180589911.6273.770.camel@edge> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1180589911.6273.770.camel@edge> User-Agent: Mutt/1.4.2.2i X-archive-position: 1269 X-Approved-By: makc@sgi.com X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: janfrode@tanso.net Precedence: bulk X-list: pcp On Thu, May 31, 2007 at 03:38:31PM +1000, Nathan Scott wrote: > > I've switched the script over to have these now, and also added > the additional "very verbose" 
(-V -V) diagnostics that the
> pmlogger_check script has - could you try out the attached
> script, in place of your current /usr/share/pcp/bin/pmie_check?

I didn't replace /usr/share/pcp/bin/pmie_check, but rather put
your script in /etc/cron.hourly/pmie_check.sh. Unfortunately
it also leaks out new instances for already running pmie's.

I killed all pmie's and restarted them using your script at 10:45AM,
and everything was working fine until 8:00PM when the pmie_check.sh
seems to have launched 5 duplicates:

$ ps -ef|grep pmie | awk '{print $11}' | sort | uniq -c|grep -v " 1 "
      2 dhcp1isp.mydomain.com
      2 ldapm1.mydomain.com
      2 ldapm2.mydomain.com
      2 ns1.mydomain.com
      2 tvservices.mydomain.com

The duplicates were all started at 8:01-8:02PM.

>
> If you still see the problem with this script, can you capture
> the ps -ef (ps -efw, ideally, cos thats what _get_pids_by_name
> does) output, and also the contents of /var/tmp/pmie (if you
> could make a tarball, that'd be great). And any additional
> diagnostics (from using "-V -V" options) that give hints as to
> why the pmie processes were started or not stopped.

I'll send a tarball privately..

Thanks for helping out!
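A minimal sketch of the same duplicate check as a reusable shell function, in case it helps when re-running the test. It assumes `ps -ef` output where the command name is field 8 and the monitored host is field 11 (as in the awk call above); both positions depend on the local ps layout and on how pmie is invoked, and the function name pmie_duplicates is made up for illustration.

```shell
# Print "count host" for every host monitored by more than one pmie process.
# Reads `ps -ef`-style lines on stdin.
# Assumptions: the command name is field 8, and the host is field 11 by
# default (override with the first argument). Adjust both for your ps layout.
pmie_duplicates() {
    awk -v field="${1:-11}" '
        # Count only real pmie processes, not grep or wrapper scripts.
        $8 == "pmie" || $8 ~ /\/pmie$/ { count[$field]++ }
        END {
            for (host in count)
                if (count[host] > 1)
                    print count[host], host
        }
    ' | sort -k2
}
```

Usage would be `ps -ef | pmie_duplicates`. Matching on the command name rather than grepping for "pmie" keeps the grep process itself out of the counts.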
-jf

From nscott@aconex.com Thu Jun 7 16:58:57 2007
Received: with ECARTIS (v1.0.0; list pcp); Thu, 07 Jun 2007 16:59:02 -0700 (PDT)
Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l57NwtWt001038 for ; Thu, 7 Jun 2007 16:58:57 -0700
Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id CB3D292C08A; Fri, 8 Jun 2007 09:58:53 +1000 (EST)
Subject: Re: pmie spawning more than 1 instance per host
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Jan-Frode Myklebust
Cc: pcp@oss.sgi.com
In-Reply-To: <20070602204048.GA4067@lc4eb6380248654.ibm.com>
References: <1180484426.6273.748.camel@edge> <20070530082218.GA6332@lc4eb6380248654.ibm.com> <1180589911.6273.770.camel@edge> <20070602204048.GA4067@lc4eb6380248654.ibm.com>
Content-Type: text/plain
Organization: Aconex
Date: Fri, 08 Jun 2007 09:57:48 +1000
Message-Id: <1181260668.3758.18.camel@edge.yarra.acx>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 1270
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

On Sat, 2007-06-02 at 22:40 +0200, Jan-Frode Myklebust wrote:
>
> I didn't replace /usr/share/pcp/bin/pmie_check, but rather put
> your script in /etc/cron.hourly/pmie_check.sh. Unfortunately
> it also leaks out new instances for already running pmie's.
>

Silly question probably - you removed the crontab entry running the original pmie_check though, right? (i.e. it was definitely the new script that started these duplicates?)

The -V -V output would be really useful, thanks - I'll keep looking at the script, but that output would give me several more hints as to the root cause.

cheers.
-- Nathan From nscott@aconex.com Thu Jun 7 23:16:53 2007 Received: with ECARTIS (v1.0.0; list pcp); Thu, 07 Jun 2007 23:16:58 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l586GpWt022579 for ; Thu, 7 Jun 2007 23:16:52 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id E9D4392C419; Fri, 8 Jun 2007 16:16:51 +1000 (EST) Subject: Re: pmie spawning more than 1 instance per host From: Nathan Scott Reply-To: nscott@aconex.com To: Jan-Frode Myklebust Cc: pcp@oss.sgi.com In-Reply-To: <1181260668.3758.18.camel@edge.yarra.acx> References: <1180484426.6273.748.camel@edge> <20070530082218.GA6332@lc4eb6380248654.ibm.com> <1180589911.6273.770.camel@edge> <20070602204048.GA4067@lc4eb6380248654.ibm.com> <1181260668.3758.18.camel@edge.yarra.acx> Content-Type: text/plain Organization: Aconex Date: Fri, 08 Jun 2007 16:15:47 +1000 Message-Id: <1181283347.3758.26.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1271 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp On Fri, 2007-06-08 at 09:57 +1000, Nathan Scott wrote: > On Sat, 2007-06-02 at 22:40 +0200, Jan-Frode Myklebust wrote: > > > > I didn't replace /usr/share/pcp/bin/pmie_check, but rather put > > your script in /etc/cron.hourly/pmie_check.sh. Unfortunately > > it also leaks out new instances for already running pmie's. > > I'm running 250+ pmies on my machine now, and running pmie_check in a loop with a 10-sec delay between runs - so far, no luck in reproducing the problem. I'll leave it running overnight. Reading through pmie_check.sh hasn't resulted in any additional insight either. 
> The -V -V output would be really useful, thanks - I'll keep looking > at the script, but that output would give me several more hints as > to the root cause. So, guess that's our best bet for tracking this down further, at this stage (sh -x /usr/share/pcp/bin/pmie_check.sh -V -V would be the next step after that I suppose, but back up a dump truck for the amount of data that will produce!). cheers. -- Nathan From nscott@aconex.com Thu Jun 14 22:47:33 2007 Received: with ECARTIS (v1.0.0; list pcp); Thu, 14 Jun 2007 22:47:42 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5F5lWWt031689 for ; Thu, 14 Jun 2007 22:47:33 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 12C2F92C443 for ; Fri, 15 Jun 2007 15:47:32 +1000 (EST) Subject: Development trees - kmchart and PCP From: Nathan Scott Reply-To: nscott@aconex.com To: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Fri, 15 Jun 2007 15:46:44 +1000 Message-Id: <1181886404.3758.243.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1272 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Hi all, I've got kmchart to a point that the new code is ready for developers to start to play with. It does have "a few" (sometimes glaring) bugs that I know about which are in the process of being fixed, but it is useable for monitoring in both live and archive mode now, and has alot of new and interesting features (tab support, revised metric selector, kmtime integration, simultaneous live/archive monitoring, record mode, metric info dialog, kitchen sink, unified time axis, better time label scaling, funky icons, etc, etc). 
At this stage I'm just pushing out my current source trees (not even source tarballs). I'm using git for revision control; if you're new to git, it's easy to use - there are docs and source code here:

http://www.kernel.org/pub/software/scm/git/docs/tutorial.html
http://www.kernel.org/pub/software/scm/git/

You can git clone/pull from the public copy of my git trees here:

git://oss.sgi.com:8090/nathans/kmchart.git
    kmchart tree, see top-level README to get started. Requires Qt3 (I haven't tried Qt4+ yet), and the Qwt snapshot below (I also haven't tried Qwt5+ yet).

git://oss.sgi.com:8090/nathans/pcp.git
    Tree has two branches - "master" which is unmodified PCP code from the 'official' SGI PCP tarballs, and "nathans" which is all of the fixes and features from my working tree. This includes the kmtime extensions to pmval, and all the fixes that aren't yet in mainline PCP releases.

git://oss.sgi.com:8090/nathans/qwt.git
    A good version of Qwt for building kmchart with. Not the latest, but it's the one I've been using for kmchart development to date. When kmchart is updated to work with Qwt5, I'll push that version in here too.

As a teaser, I've put a screenshot showing some sample kmchart windows over here ("Here's one I prepared earlier...!"):

http://oss.sgi.com/~nathans/kmchart-desktop.png

Enjoy! I'll start spamming^Wsending a note to the list each time I push updates into any of those trees.

cheers.
-- Nathan From nscott@aconex.com Thu Jun 14 23:05:04 2007 Received: with ECARTIS (v1.0.0; list pcp); Thu, 14 Jun 2007 23:05:09 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5F650Wt002150 for ; Thu, 14 Jun 2007 23:05:03 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id C776392C5D6; Fri, 15 Jun 2007 16:05:00 +1000 (EST) Subject: Re: Development trees - kmchart and PCP From: Nathan Scott Reply-To: nscott@aconex.com To: Michael Newton Cc: pcp@oss.sgi.com In-Reply-To: <1181886404.3758.243.camel@edge.yarra.acx> References: <1181886404.3758.243.camel@edge.yarra.acx> Content-Type: text/plain Organization: Aconex Date: Fri, 15 Jun 2007 16:04:13 +1000 Message-Id: <1181887453.3758.259.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1273 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Hi Michael, On Fri, 2007-06-15 at 15:46 +1000, Nathan Scott wrote: > > git://oss.sgi.com:8090/nathans/pcp.git > Tree has two branches - "master" which is unmodified PCP code from > the 'official' SGI PCP tarballs, and "nathans" which is all of the > fixes and features from my working tree. This includes the kmtime > extensions to pmval, and all the fixes that aren't yet in mainline > PCP releases. This tree will be a much saner way for you to pick up PCP patches from me (and others here) ... I'm planning to just send a note when I update this tree, instead of patch-bombing you as I've done in the past, so you can pull from it at your leisure. Below is the current set of patches in the "nathans" branch. The git docs describe how to extract a patch from git in a form that should be easy to drop into your ptools tree(s) - see my example at the end. cheers. 
-- Nathan 15:53 nathans@edge ~ 1> cd /source/git/pcp 15:54 nathans@edge /source/git/pcp 2> git branch * master nathans origin 15:54 nathans@edge /source/git/pcp 3> git log commit 0d122added4cb56af881cc943bf2d2f26c160d1e Author: Nathan Scott Date: Thu Jun 14 10:47:08 2007 +1000 Initial commit of SGI 2.7.1-1 PCP source. 15:54 nathans@edge /source/git/pcp 4> git checkout nathans 15:54 nathans@edge /source/git/pcp 5> git branch master * nathans origin 15:54 nathans@edge /source/git/pcp 6> git log commit 65946bf7339338963e0b468dd1608c2d479a8d75 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Append current date string to build number commit 57be02a88757b81d2b488da1b9dc553dd6f6668e Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Update pmie start and stop scripts Incorporate bugfixes and platform independence changes from the pmlogger_check script into pmie_check. commit 56dd951709324ae0786ef35d7c16321185120a44 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 pmlogsummary report sum option Allow pmloggsummary to report the sum() of metrics/instances, in addition to all of the other statistics reported. commit 9635c7f5f4eec7f3694f82bb3d399ab0be8eff24 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix Windows uname metrics Somehow, this chunk of an earlier Windows PMDA patch was missed, and theres no code behind the pmda.* metrics there at the moment. commit a3a12b249675f6ad9efcb0e57e083d2ca7fd62f9 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix PCP start script regressions Partially revert a change to the pcp start script, so that it will attempt to shutdown running pmcd processes under all circumstances. Also ensure the $PCP_RUN_DIR exists, else pmcd fails to start from this script on Windows at least. 
commit 8e7f8da4e316087772db5d45edd122353ce7289a Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 pmval kmtime support Introduce libkmtime, which provides a kmtime client interface for C/C++ programs using a similar protocol to that used by pmtime. This library contains no GUI code (adds no additional dependencies to PCP - used POSIX calls only). It provides time sychronisation for PCP tools that support it. Support has been added to pmval and this co-exists with its existing pmtime support. The kmtime tool is entirely separate, and is packaged and released separately to PCP itself (it does depend on certain GUI libraries, of course). An updated version of Ken's kmchart utility, which now uses kmtime, is also being made generally available. commit 462af4362ec3df582960daba675408acec838304 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Generate LSM file The version number update was missed in the old LSM file. This patch makes the LSM (Linux Software Map) file configure-generated so that its stamped with the correct version number, date, etc, without any manual intervention. Also updated PCP to use version 4 of of the LSM format (incl. differently named file). commit c643892bb0f8be3d64a39e13933dcf24d14197d4 Author: David Chatterton Date: Thu Jun 14 10:32:06 2007 +1000 Upgrade Windows compiler version Switch to newer compiler version for native Windows components. commit aabb75f1f9a5a735c5675697ba7a81cf9cbdfc34 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Revert syslog for Cygwin Cygwin build fails with this recent addition, so reverting it. Possibly a missing (not yet installed?) header on the machine where the problem that lead to this change was observed? commit af6398aef276f250782017e3eb5b110b7219a42f Author: Ken McDonell Date: Thu Jun 14 10:32:06 2007 +1000 Windows uuencoded binaries Add in new uuencoded shim and show-all-counters binary files, to make merging updates simpler (ascii text is patch-able). 
commit 1500bd088898317c42eede0f0748f1fd09989c69 Author: Ken McDonell Date: Thu Jun 14 10:32:06 2007 +1000 Additional Linux SNMP metrics Support for additional network.udp and network.udplite metrics, which are exported through the /proc/net/snmp kernel interface. commit bb8a668bcb72283a994b85316e3de96d2f07a2ed Author: Ken McDonell Date: Thu Jun 14 10:32:06 2007 +1000 Fix Linux vmstat nr_slab metrics Update the Linux PMDA vmstat metrics, after changes and additions to these counters in 2.6.18 virtual memory subsystem. commit cf618856627c2354566f2c7d50daf3feecc98034 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Make pcp status command report build version Update the pcp(1) command to report build version. Add in the missing help text for pmcd.build as well. commit 329aada114b0dd0cc5798ecaf23cb027b2447daa Author: Ken McDonell Date: Thu Jun 14 10:32:06 2007 +1000 Windows split_io metrics Add support for the per-device and aggregate Windows "Split IO" performance counters. commit a058a8446b208a30583873689414b9666ade354a Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix pmdapmcd empty pmie instance Fix the handling of the pmie instance domain, such that when pmie instances are stopped, the pmcd PMDA instance domain is correctly updated in all cases. commit 30be08b73af760067f13c928dbfd88e8a4c99caf Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 pmdamailq filename regex Allow the mailq agent to be used for MTAs other than sendmail, by providing a way for regular expressions to be used to match mail message files instead of hard-coding the sendmail naming convention. commit b6d58c31f4845963fccf14616ef9ed6e59a5e25b Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix Windows filesys.used metric Fix an error in the Windows PMDA filesystem freespace calculation. 
commit ecab101bea0fc58ff43103d0e32c38dd34254202 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Quiet libpcp_pmda with appversion error code PM_ERR_APPVERSION is a reasonable value for a PMDA fetch callback to return - this change stops libpcp_pmda from filling such agents logs with warnings when this is used, unless in debug mode. commit 8bf3c03805c97b3db0faf93c693af1a6f65b141e Author: David Chatterton Date: Thu Jun 14 10:32:06 2007 +1000 Windows DDK unresolved symbol fix Fix compilation of parts of the Windows PMDA with newer versions of the Windows C compiler (and DDK). commit f40b51dbeb9e93dcd82aab1d861f1683fded86fe Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Windows TCP metrics Add support for the network.tcp metric hierarchy to the Windows agent (naming convention matches those used on other platforms). commit fa5665062ea327e84453d78b97f10cf598d5a7f3 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix Linux context switch metric value Fix a cut-and-paste error in the value reported by the Linux PMDA for context switches. commit 8d5c06982c523c25d5339d2f22fe190798c7c60e Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix filesys.free calculation to use b_avail Most filesystems reserve an amount of space that can only be allocated by the root user, once out-of-space conditions are reached. This is reflected by differences in the statfs(2) b_avail and b_free values. This patch makes PCP filesystem metric values match up with reality according to df(1). commit c8d743924d0ee89f6566c5335efe09b70bfa101e Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix objstyle magic file checks Add another place to search for magic files, which fixes some build issues on certain recent Linux distributions. Also fix a shell error from some versions of sh(1) where whitespace is required before a tests closing square bracket. 
commit 95395026d5e47cb23d138d5cd25bd32a07b9a7a5 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 pmie log file rotation Allow pmie logs to be rotated as part of the regular daily PCP house-keeping scripts (introducing a pmie_daily to do this). The pmie "print" action, when used in conjunction with other actions for regular alarming, can provide an invaluable audit trail of historical rule firings. However, in production one cannot have unbounded log growth - so, this implements rotation, compression, and culling in a similar way to the pmlogger log management. At the same time I fixed several small issues that were found while reviewing the pmlogger daily archive maintenance scripts. I also changed the default compression program to be bzip2(1), from compress (which doesn't seem to exist on any PCP platforms other than IRIX). commit 6d489840bede03c31ecb571c5593afe2bfab8d8a Author: Ken McDonell Date: Thu Jun 14 10:32:06 2007 +1000 Fix pmie %sample calculations Resolve a rounding problem in the pmie percent-{sample,host,inst} expressions. The percent calculations could result in values which should have been zero, but were very small positive/negative values. End result was rules not evaluating to true when they should have. commit c323229fe46a2d550f420664b0d9082f87e7140c Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Convert instant to discrete in numerous metrics When using the pmlogger log-once directive for discrete metrics, it helps alot if they actually are exported as discrete metrics. When they are not, the client tools are unable to operate correctly on the daily archives produced by pmlogger_daily, as interpolation is attempted (but cannot be done with just the one value). This fixes a fistful of kernel metrics for Linux, Mac and Windows that exhibited this problem. 
commit ef7ac2472b53f760e8cc5710b5d9b2e16b238480 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Replace sginap with nanosleep Remove use of sginap from several tools. The sginap interface is terrible for our needs in PCP - its argument is a number-of-jiffies which means all tools using it have to multiply their time-to-sleep by CLK_TCK at each call site. The non-IRIX implementations all then promptly divide by CLK_TCK to get back to the parameter which they pass to the underlying sleep routine. Worse, since CLK_TCK is actually a large number, when using a pmie rule with a relatively large delta (1 hour), the multiplication also overflows, and incorrect values are reported. This patch changes pmie and pmval (which were the tools I was using at the time) to use the POSIX nanosleep(2) interface. Probably this patch should go further and remove each and every reference to sginap from the PCP source, but I've not done that at this stage. That may also result in some autoconf cleanup, to rid us of all the usleep / sleep / what-have-you platform-specific checks. AFAICT, nanosleep seems to exist on every supported platform. commit 28a73b5100b09f75d45b28b2369d21cc040ce427 Author: Ken McDonell Date: Thu Jun 14 10:32:06 2007 +1000 Fix pmie constant folding Fix a bug when pmie evaluates rules that make multiple references to a single constant, where it was incorrectly folding a constant after the first use. Corruption of pmie's address space results, when subsequent rules make use of the folded constant expression, which has been observed as causing SIGSEGVs, infinite loops, and other random wierdness. commit c82841863880e9d6149c91368f97f855f060900d Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix Linux proc metric units Update the metric units for several Linux PMDA counter metrics in the proc hierarchy. 
commit 1deedfaafd53cc021e882bad23b996eeef4e04d5 Author: Nathan Scott Date: Thu Jun 14 10:32:06 2007 +1000 Fix pmie stomp action in archive mode Ensure related functions are kept together in pmie, this convention was accidentally dropped earlier. Also ensure we don't act on stomp actions in archive mode (as is done for other action types). commit 0d122added4cb56af881cc943bf2d2f26c160d1e Author: Nathan Scott Date: Thu Jun 14 10:47:08 2007 +1000 Initial commit of SGI 2.7.1-1 PCP source. 15:54 nathans@edge /source/git/pcp 7> 15:55 nathans@edge /source/git/pcp 7> git diff 1deedfaafd53cc021e882bad23b996eeef4e04d5..c82841863880e9d6149c91368f97f855f060900d diff --git a/src/pmdas/linux/pmda.c b/src/pmdas/linux/pmda.c index 61ad5fd..227cb1d 100644 --- a/src/pmdas/linux/pmda.c +++ b/src/pmdas/linux/pmda.c @@ -1163,22 +1163,22 @@ static pmdaMetric metrictab[] = { /* proc.psinfo.minflt */ { NULL, { PMDA_PMID(CLUSTER_PID_STAT,9), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER, - PMDA_PMUNITS(0,0,0,0,0,0) } }, + PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) } }, /* proc.psinfo.cmin_flt */ { NULL, { PMDA_PMID(CLUSTER_PID_STAT,10), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER, - PMDA_PMUNITS(0,0,0,0,0,0) } }, + PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) } }, /* proc.psinfo.maj_flt */ { NULL, { PMDA_PMID(CLUSTER_PID_STAT,11), PM_TYPE_U32, PROC_INDOM, PM_SEM_COUNTER, - PMDA_PMUNITS(0,0,0,0,0,0) } }, + PMDA_PMUNITS(0,0,1,0,0,PM_COUNT_ONE) } }, 15:55 nathans@edge /source/git/pcp 8> From markgw@sgi.com Mon Jun 18 12:47:08 2007 Received: with ECARTIS (v1.0.0; list pcp); Mon, 18 Jun 2007 12:47:40 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l5IJkvdq002574 for ; Mon, 18 Jun 2007 12:47:06 -0700 Received: from [134.14.55.17] (dhcp17.melbourne.sgi.com [134.14.55.17]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA05869; Mon, 18 Jun 2007 15:55:10 +1000 Message-ID: 
<46761E18.501@sgi.com> Date: Mon, 18 Jun 2007 15:54:32 +1000 From: Mark Goodwin Reply-To: markgw@sgi.com Organization: SGI Engineering User-Agent: Thunderbird 1.5.0.12 (Windows/20070509) MIME-Version: 1.0 To: nscott@aconex.com, pcp@oss.sgi.com Subject: Re: Development trees - kmchart and PCP References: <1181886404.3758.243.camel@edge.yarra.acx> <1181887453.3758.259.camel@edge.yarra.acx> <1182143055.30716.5.camel@edge.yarra.acx> In-Reply-To: <1182143055.30716.5.camel@edge.yarra.acx> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1274 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: markgw@sgi.com Precedence: bulk X-list: pcp Nathan, how do you want to package kmchart, kmtime, et al? a) just via your git tree (i.e. src only, pull it down and build it yourself), or b) integrate into the existing open source tree with some process for regular merges from your git tree, or c) ? If (b), we would include it as part of the pre-built RPMs on oss.sgi.com. 
Anyone working on kmgadgets yet ;-) Cheers -- Mark From nscott@aconex.com Mon Jun 18 13:14:26 2007 Received: with ECARTIS (v1.0.0; list pcp); Mon, 18 Jun 2007 13:14:32 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5IKEOdo015830 for ; Mon, 18 Jun 2007 13:14:26 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 14CBC92C4D3; Mon, 18 Jun 2007 16:02:12 +1000 (EST) Subject: Re: Development trees - kmchart and PCP From: Nathan Scott Reply-To: nscott@aconex.com To: markgw@sgi.com Cc: pcp@oss.sgi.com In-Reply-To: <46761E18.501@sgi.com> References: <1181886404.3758.243.camel@edge.yarra.acx> <1181887453.3758.259.camel@edge.yarra.acx> <1182143055.30716.5.camel@edge.yarra.acx> <46761E18.501@sgi.com> Content-Type: text/plain Organization: Aconex Date: Mon, 18 Jun 2007 16:01:31 +1000 Message-Id: <1182146491.30716.12.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1275 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Hi Mark, On Mon, 2007-06-18 at 15:54 +1000, Mark Goodwin wrote: > > Nathan, how do you want to package kmchart, kmtime, et al? > > a) just via your git tree (i.e. src only, pull it down and build > it yourself), or > > b) integrate into the existing open source tree with > some process for regular merges from your git tree, or > > c) ? The kmchart tree has a similar build infrastructure to PCP - you type ./Makepkgs in the toplevel and it spits out rpms, tarballs, etc at the end. > > If (b), we would include it as part of the pre-built RPMs on > oss.sgi.com. kmchart has "exotic" dependencies (Qt, Qwt), so not a great idea to merge it into base pcp ... 
when its a bit closer to complete, it'd make sense to put pre-built kmchart rpms, etc on oss, I think. > Anyone working on kmgadgets yet ;-) Heh - kmchart still has a todo list (incl. bug fixes) as long as both my arms... so, not me. :) cheers. -- Nathan From nscott@aconex.com Mon Jun 18 17:09:13 2007 Received: with ECARTIS (v1.0.0; list pcp); Mon, 18 Jun 2007 17:09:19 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5J09Cdo031488 for ; Mon, 18 Jun 2007 17:09:13 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 4173E92C49A for ; Tue, 19 Jun 2007 10:09:13 +1000 (EST) Subject: kmchart updates From: Nathan Scott Reply-To: nscott@aconex.com To: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Tue, 19 Jun 2007 10:08:34 +1000 Message-Id: <1182211714.30716.25.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1276 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Changes committed to git://oss.sgi.com:8090/nathans/kmchart.git images/archive.svg | 4 images/back_archive.svg | 18 - images/back_off.svg | 14 - images/back_on.svg | 14 - images/document-properties.png |binary images/document-properties.svg | 551 +++++++++++++++++++-------------------- images/fastback_archive.svg | 24 - images/fastback_off.svg | 24 - images/fastback_on.svg | 24 - images/fastfwd_archive.svg | 24 - images/fastfwd_off.svg | 24 - images/fastfwd_on.svg | 24 - images/filearchive.svg | 4 images/fileusers.svg | 537 -------------------------------------- images/fileview.svg | 574 +++++++++++++++++++++++++++++++++++++++++ images/folio.svg | 4 images/kmchart.svg | 4 images/logfile.svg | 4 images/play_archive.svg | 18 - images/play_live.svg | 18 - 
images/play_off.svg | 14 - images/play_on.svg | 14 - images/play_record.svg | 18 - images/stepback_archive.svg | 14 - images/stepback_off.svg | 14 - images/stepback_on.svg | 14 - images/stepfwd_archive.svg | 14 - images/stepfwd_off.svg | 14 - images/stepfwd_on.svg | 14 - images/stop_archive.svg | 18 - images/stop_live.svg | 18 - images/stop_off.svg | 14 - images/stop_on.svg | 8 images/stop_record.svg | 18 - images/tab-edit.png |binary images/tab-edit.svg | 112 ++++++-- images/toolarchive.png |binary images/toolarchive.svg | 491 +++++++++++++++++++++++++++++++++++ images/toolusers.png |binary images/toolusers.svg | 540 ++++++++++++++++++++++++++++++++++++++ images/toolview.png |binary images/toolview.svg | 574 +++++++++++++++++++++++++++++++++++++++++ images/view.svg | 4 src/chart/chart.cpp | 101 +++---- src/chart/chart.h | 1 src/chart/kmchart.pro | 6 src/chart/kmchart.ui.h | 23 + src/chart/main.cpp | 123 +++++--- src/chart/main.h | 11 src/chart/settingsdialog.ui.h | 4 src/chart/source.cpp | 2 src/chart/tab.cpp | 87 ++++-- src/chart/tab.h | 9 src/chart/tabdialog.ui | 190 ++++++++++--- src/chart/tabdialog.ui.h | 74 ++++- src/chart/view.cpp | 11 57 files changed, 3206 insertions(+), 1271 deletions(-) commit 4a6e337aa88910de1dcf5abeaac8762a47e7662f Author: Nathan Scott Date: Tue Jun 19 09:36:01 2007 +1000 Several visible/sample history size related fixes. commit 941118d4b9390683c85c849169dbba34fe086a26 Author: Nathan Scott Date: Sat Jun 16 13:34:50 2007 +1000 Fix icons sizes on view dialog toolbar, update paths in all svg files. 
From nscott@aconex.com Mon Jun 18 23:32:37 2007 Received: with ECARTIS (v1.0.0; list pcp); Mon, 18 Jun 2007 23:32:42 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5J6WZdo024976 for ; Mon, 18 Jun 2007 23:32:36 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id A307692C644; Tue, 19 Jun 2007 16:32:33 +1000 (EST) Subject: Review: expand the scope of pmie logical and/or expressions From: Nathan Scott Reply-To: nscott@aconex.com To: kmcdonell@aconex.com Cc: pcp@oss.sgi.com Content-Type: multipart/mixed; boundary="=-PRaSo4PlXKLlZWTRzI1T" Organization: Aconex Date: Tue, 19 Jun 2007 16:31:09 +1000 Message-Id: <1182234669.4249.15.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 1277 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp --=-PRaSo4PlXKLlZWTRzI1T Content-Type: text/plain Content-Transfer-Encoding: 7bit Hi Ken, Any chance you could look over the attached patch for me please? This implements the changes to the pmie logical OR operator that we discussed - namely, allowing an expression to evaluate to true when only one side of the expression tree can be evaluated (i.e. due to host down / instance unavailable / insufficient samples). I realised a similar issue exists with the AND operator after we spoke, in that it can successfully evaluate to false even though it may not have all data available for an expression. The patch seems to work correctly for the following test cases, but wouldn't mind you taking a detailed look, with your pmie-fu black belt on. It takes the approach we discussed, of pulling these "special" operators out of the skeleton/generated code infrastructure and hand-coding them (based on the original code). Thanks! 
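The behaviour described above amounts to three-valued logic, which can be sketched as follows. This is an illustration only, not the code in the attached patch; the function names tri_or and tri_and and the value strings true, false and unknown are hypothetical.

```shell
# Three-valued OR: true if either side is known to be true, even when the
# other side cannot be evaluated; false only if both sides are known false.
tri_or() {
    case "$1:$2" in
        true:*|*:true) echo true ;;
        false:false)   echo false ;;
        *)             echo unknown ;;
    esac
}

# Three-valued AND: false if either side is known to be false, even when the
# other side cannot be evaluated; true only if both sides are known true.
tri_and() {
    case "$1:$2" in
        false:*|*:false) echo false ;;
        true:true)       echo true ;;
        *)               echo unknown ;;
    esac
}
```

Under these tables, `tri_or unknown true` gives true (as for rules three to five in the test file), `tri_and unknown false` gives false (rule seven), and `tri_or unknown unknown` stays unknown (rule six).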
16:10 nathans@edge ~/pmie 133> cat testor one = some_inst (sample.noinst > 0) -> print "Should be unknown always\n"; two = some_inst (sample.noinst > 0) && hinv.ncpu > 0 -> print "Should be false always\n"; three = some_inst (sample.noinst > 0) || hinv.ncpu > 0 -> print "Should be true if fixed\n"; four = some_inst (0 < sample.noinst) || hinv.ncpu > 0 -> print "Should be true if fixed\n"; five = sample.bad.unknown == 42 || hinv.ncpu > 0 -> print "Should be true if fixed\n"; six = sample.bad.unknown == 42 || some_inst (sample.noinst > 0) -> print "Should always be unknown\n"; seven = some_inst (sample.noinst > 0) && hinv.ncpu == 0 -> print "Should be false if fixed\n"; 16:12 nathans@edge ~/pmie 134> pmie -c testor -v -t 1 pmie: metric sample.bad.unknown not currently available from host edge pmLookupDesc failed: Unknown or illegal metric identifier pmie: metric sample.bad.unknown not currently available from host edge pmLookupDesc failed: Unknown or illegal metric identifier one: ? two: ? three: ? four: ? five: ? six: ? seven: ? one: ? two: ? three: ? four: ? five: ? six: ? seven: ? 16:13 nathans@edge ~/pmie 135> /source/git/pcp/src/pmie/src/pmie -c testor -v -t 1 pmie: metric sample.bad.unknown not currently available from host edge pmLookupDesc failed: Unknown or illegal metric identifier pmie: metric sample.bad.unknown not currently available from host edge pmLookupDesc failed: Unknown or illegal metric identifier Tue Jun 19 16:13:09 2007: Should be true if fixed Tue Jun 19 16:13:09 2007: Should be true if fixed Tue Jun 19 16:13:09 2007: Should be true if fixed one: ? two: ? three: true four: true five: true six: ? seven: false Tue Jun 19 16:13:10 2007: Should be true if fixed Tue Jun 19 16:13:10 2007: Should be true if fixed Tue Jun 19 16:13:10 2007: Should be true if fixed one: ? two: ? three: true four: true five: true six: ? 
seven: false 16:13 nathans@edge ~/pmie 136> --=-PRaSo4PlXKLlZWTRzI1T Content-Disposition: attachment; filename=pmie.diff Content-Type: text/x-patch; name=pmie.diff; charset=UTF-8 Content-Transfer-Encoding: 7bit diff --git a/src/pmie/src/GNUmakefile b/src/pmie/src/GNUmakefile index 736381d..52251ff 100644 --- a/src/pmie/src/GNUmakefile +++ b/src/pmie/src/GNUmakefile @@ -35,10 +35,10 @@ include $(TOPDIR)/src/include/builddefs TARGET = pmie$(EXECSUFFIX) CFILES = pmie.c symbol.c dstruct.c lexicon.c syntax.c pragmatics.c eval.c \ - show.c match_inst.c syslog.c stomp.c + show.c match_inst.c syslog.c stomp.c conjunct.c HFILES = fun.h dstruct.h eval.h lexicon.h pmiestats.h pragmatics.h \ - show.h symbol.h syntax.h syslog.h stomp.h + show.h symbol.h syntax.h syslog.h stomp.h conjunct.h SKELETAL = hdr.sk fetch.sk misc.sk aggregate.sk unary.sk binary.sk \ merge.sk act.sk diff --git a/src/pmie/src/conjunct.c b/src/pmie/src/conjunct.c new file mode 100644 index 0000000..a31a731 --- /dev/null +++ b/src/pmie/src/conjunct.c @@ -0,0 +1,552 @@ +/* + * Copyright (c) 1995-2002 Silicon Graphics, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details. 
+ * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Contact information: Silicon Graphics, Inc., 1500 Crittenden Lane, + * Mountain View, CA 94043, USA, or: http://www.sgi.com + */ + +/*********************************************************************** + * conjunct.c + * + * These functions were originally generated from skeletons .sk + * by the shell-script './meta', then modified to support the semantics + * of the boolean AND/OR operators correctly. These are different to + * every other operator in that they do not always require both sides + * of the expression to be available in order to be evaluated, i.e. + * OR: if either side of the expression is true, expr is true + * AND: if either side of the expression is false, expr is false + ***********************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include "pmapi.h" +#include "dstruct.h" +#include "pragmatics.h" +#include "fun.h" +#include "show.h" +#include "stomp.h" + + +/* + * operator: cndOr + */ + +#define OR(x,y) (((x) == TRUE || (y) == TRUE) ? TRUE : (((x) == FALSE && (y) == FALSE) ? FALSE : DUNNO)) +#define OR1(x) (((x) == TRUE) ? TRUE : DUNNO) + +void +cndOr_n_n(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + Truth *ip1; + Truth *ip2; + Truth *op; + int n; + int i; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + ip1 = (Truth *)is1->ptr; + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + *op++ = OR(*ip1, *ip2); + ip1++; + ip2++; + } + os->stamp = (is1->stamp > is2->stamp) ? 
is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + Truth answer = DUNNO; + + ip1 = (Truth *)is1->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = OR1(*ip1)) == TRUE) + answer = TRUE; + ip1++; + } + if (answer == TRUE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + Truth answer = DUNNO; + + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = OR1(*ip2)) == TRUE) + answer = TRUE; + ip2++; + } + if (answer == TRUE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndOr_n_n(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +void +cndOr_n_1(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + Truth *ip1; + Truth iv2; + Truth *op; + int n; + int i; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + ip1 = (Truth *)is1->ptr; + iv2 = *(Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + *op++ = OR(*ip1, iv2); + ip1++; + } + os->stamp = (is1->stamp > is2->stamp) ? 
is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + Truth answer = DUNNO; + + ip1 = (Truth *)is1->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = OR1(*ip1)) == TRUE) + answer = TRUE; + ip1++; + } + if (answer == TRUE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + if ((*(Truth *)os->ptr = OR1(*(Truth *)is2->ptr)) == TRUE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndOr_n_1(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +void +cndOr_1_n(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + Truth iv1; + Truth *ip2; + Truth *op; + int n; + int i; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + iv1 = *(Truth *)is1->ptr; + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + *op++ = OR(iv1, *ip2); + ip2++; + } + os->stamp = (is1->stamp > is2->stamp) ? 
is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + if ((*(Truth *)os->ptr = OR1(*(Truth *)is1->ptr)) == TRUE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + Truth answer = DUNNO; + + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = OR1(*ip2)) == TRUE) + answer = TRUE; + ip2++; + } + if (answer == TRUE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndOr_1_n(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +void +cndOr_1_1(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + *(Truth *)os->ptr = OR(*(Truth *)is1->ptr, *(Truth *)is2->ptr); + os->stamp = (is1->stamp > is2->stamp) ? is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + if ((*(Truth *)os->ptr = OR1(*(Truth *)is1->ptr)) == TRUE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + if ((*(Truth *)os->ptr = OR1(*(Truth *)is2->ptr)) == TRUE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndOr_1_1(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +/* + * operator: cndAnd + */ + +#define AND(x,y) (((x) == TRUE && (y) == TRUE) ? TRUE : (((x) == FALSE || (y) == FALSE) ? FALSE : DUNNO)) +#define AND1(x) (((x) == FALSE) ? 
FALSE : DUNNO) + +void +cndAnd_n_n(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + Truth *ip1; + Truth *ip2; + Truth *op; + int n; + int i; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + ip1 = (Truth *)is1->ptr; + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + *op++ = AND(*ip1, *ip2); + ip1++; + ip2++; + } + os->stamp = (is1->stamp > is2->stamp) ? is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + Truth answer = DUNNO; + + ip1 = (Truth *)is1->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = AND1(*ip1)) == FALSE) + answer = FALSE; + ip1++; + } + if (answer == FALSE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + Truth answer = DUNNO; + + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = AND1(*ip2)) == FALSE) + answer = FALSE; + ip2++; + } + if (answer == FALSE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndAnd_n_n(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +void +cndAnd_n_1(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + Truth *ip1; + Truth iv2; + Truth *op; + int n; + int i; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + ip1 = (Truth *)is1->ptr; + iv2 = *(Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + *op++ = AND(*ip1, iv2); + ip1++; + } + os->stamp = (is1->stamp > is2->stamp) ?
is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + Truth answer = DUNNO; + + ip1 = (Truth *)is1->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = AND1(*ip1)) == FALSE) + answer = FALSE; + ip1++; + } + if (answer == FALSE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + if ((*(Truth *)os->ptr = AND1(*(Truth *)is2->ptr)) == FALSE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndAnd_n_1(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +void +cndAnd_1_n(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + Truth iv1; + Truth *ip2; + Truth *op; + int n; + int i; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + iv1 = *(Truth *)is1->ptr; + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + *op++ = AND(iv1, *ip2); + ip2++; + } + os->stamp = (is1->stamp > is2->stamp) ? 
is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + if ((*(Truth *)os->ptr = AND1(*(Truth *)is1->ptr)) == FALSE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + Truth answer = DUNNO; + + ip2 = (Truth *)is2->ptr; + op = (Truth *)os->ptr; + n = x->tspan; + for (i = 0; i < n; i++) { + if ((*op++ = AND1(*ip2)) == FALSE) + answer = FALSE; + ip2++; + } + if (answer == FALSE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndAnd_1_n(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} + +void +cndAnd_1_1(Expr *x) +{ + Expr *arg1 = x->arg1; + Expr *arg2 = x->arg2; + Sample *is1 = &arg1->smpls[0]; + Sample *is2 = &arg2->smpls[0]; + Sample *os = &x->smpls[0]; + + EVALARG(arg1) + EVALARG(arg2) + ROTATE(x) + + if (arg1->valid && arg2->valid) { + *(Truth *)os->ptr = AND(*(Truth *)is1->ptr, *(Truth *)is2->ptr); + os->stamp = (is1->stamp > is2->stamp) ? 
is1->stamp : is2->stamp; + x->valid++; + } + else if (arg1->valid) { + if ((*(Truth *)os->ptr = AND1(*(Truth *)is1->ptr)) == FALSE) { + os->stamp = is1->stamp; + x->valid++; + } + else x->valid = 0; + } + else if (arg2->valid) { + if ((*(Truth *)os->ptr = AND1(*(Truth *)is2->ptr)) == FALSE) { + os->stamp = is2->stamp; + x->valid++; + } + else x->valid = 0; + } + else x->valid = 0; + +#if PCP_DEBUG + if (pmDebug & DBG_TRACE_APPL2) { + fprintf(stderr, "cndAnd_1_1(" PRINTF_P_PFX "%p) ...\n", x); + dumpExpr(x); + } +#endif +} diff --git a/src/pmie/src/conjunct.h b/src/pmie/src/conjunct.h new file mode 100644 index 0000000..a12130b --- /dev/null +++ b/src/pmie/src/conjunct.h @@ -0,0 +1,37 @@ +/*********************************************************************** + * conjunct.h - Logical AND/OR expression evaluator functions + *********************************************************************** + * + * Copyright (c) 1995 Silicon Graphics, Inc. All Rights Reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + * or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * for more details.
+ * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Contact information: Silicon Graphics, Inc., 1500 Crittenden Lane, + * Mountain View, CA 94043, USA, or: http://www.sgi.com + */ +#ifndef CONJUNCT_H +#define CONJUNCT_H + +/* expression evaluator function prototypes */ +void cndOr_n_n(Expr *); +void cndOr_1_n(Expr *); +void cndOr_n_1(Expr *); +void cndOr_1_1(Expr *); +void cndAnd_n_n(Expr *); +void cndAnd_1_n(Expr *); +void cndAnd_n_1(Expr *); +void cndAnd_1_1(Expr *); + +#endif /* CONJUNCT_H */ diff --git a/src/pmie/src/fun.h b/src/pmie/src/fun.h index ba26ea6..fd0b128 100644 --- a/src/pmie/src/fun.h +++ b/src/pmie/src/fun.h @@ -28,6 +28,7 @@ #define FUN_H #include "dstruct.h" +#include "conjunct.h" #define ROTATE(x) if ((x)->nsmpls > 1) rotate(x); #define EVALARG(x) if ((x)->op < NOP) ((x)->eval)(x); @@ -101,14 +102,6 @@ void cndRise_n(Expr *); void cndRise_1(Expr *); void cndFall_n(Expr *); void cndFall_1(Expr *); -void cndAnd_n_n(Expr *); -void cndAnd_1_n(Expr *); -void cndAnd_n_1(Expr *); -void cndAnd_1_1(Expr *); -void cndOr_n_n(Expr *); -void cndOr_1_n(Expr *); -void cndOr_n_1(Expr *); -void cndOr_1_1(Expr *); void cndMatch_inst(Expr *); void cndAll_host(Expr *); void cndAll_inst(Expr *); diff --git a/src/pmie/src/meta b/src/pmie/src/meta index 3955602..582c078 100755 --- a/src/pmie/src/meta +++ b/src/pmie/src/meta @@ -237,14 +237,6 @@ fun=cndNot op="OP(x) (((x) == TRUE || (x) == FALSE) ? !(x) : DUNNO)" _unary -fun=cndAnd -op="OP(x,y) (((x) == TRUE \\&\\& (y) == TRUE) ? TRUE : (((x) == FALSE || (y) == FALSE) ? FALSE : DUNNO))" -_binary - -fun=cndOr -op="OP(x,y) (((x) == TRUE || (y) == TRUE) ? TRUE : (((x) == FALSE \\&\\& (y) == FALSE) ? 
FALSE : DUNNO))" -_binary - fun=cndRise delta="" op=">" --=-PRaSo4PlXKLlZWTRzI1T-- From nscott@aconex.com Tue Jun 19 15:56:19 2007 Received: with ECARTIS (v1.0.0; list pcp); Tue, 19 Jun 2007 15:56:24 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5JMuIdo032359 for ; Tue, 19 Jun 2007 15:56:19 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 3D3BE92C3E5 for ; Wed, 20 Jun 2007 08:56:19 +1000 (EST) Subject: kmchart updates From: Nathan Scott Reply-To: nscott@aconex.com To: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Wed, 20 Jun 2007 08:54:56 +1000 Message-Id: <1182293696.4249.23.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1278 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Changes committed to git://oss.sgi.com:8090/nathans/kmchart.git README | 3 +++ src/time/GNUmakefile | 2 +- src/time/constants.h | 29 ----------------------------- src/time/kmtimearch.ui.h | 36 ++++++++++++++++++++++++------------ src/time/main.h | 10 +++++----- 5 files changed, 33 insertions(+), 47 deletions(-) commit f2a797c47e9f84e8f93a80e5c556aa9471604e6a Merge: f4ad865... 9aa2431... Author: Nathan Scott Date: Wed Jun 20 08:50:13 2007 +1000 Merge leaf:/source/git/kmchart/ commit 9aa2431ebff0bf0addf097acdf2e441e87906e85 Author: Nathan Scott Date: Wed Jun 20 08:28:35 2007 +1000 Default/min/max kmtime speed now scales based on the sample interval. commit f4ad865a657662243b2c157aaf010a089d76432f Author: Nathan Scott Date: Tue Jun 19 11:16:53 2007 +1000 Add another note to the README about record mode. 
From nscott@aconex.com Wed Jun 20 15:27:00 2007 Received: with ECARTIS (v1.0.0; list pcp); Wed, 20 Jun 2007 15:27:05 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5KMQvdo029065 for ; Wed, 20 Jun 2007 15:27:00 -0700 Received: from edge.local (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 14F4692C40A for ; Thu, 21 Jun 2007 08:26:57 +1000 (EST) Subject: kmchart updates From: Nathan Scott Reply-To: nscott@aconex.com To: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Thu, 21 Jun 2007 08:25:36 +1000 Message-Id: <1182378336.4249.71.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1279 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Changes committed to git://oss.sgi.com:8090/nathans/kmchart.git src/chart/main.cpp | 8 src/chart/main.h | 1 src/chart/settingsdialog.ui.h | 4 src/chart/tabdialog.ui | 490 +++++++++++++++++++++--------------------- src/chart/tabdialog.ui.h | 8 5 files changed, 260 insertions(+), 251 deletions(-) commit e1df90a6d7a549d2228c245db1c7d6277dd80791 Author: Nathan Scott Date: Thu Jun 21 08:00:35 2007 +1000 Fix window resizing issues on the EditTab dialog. commit b3d9ca7ddd5eb21e7f263c77397ad9c64481d27a Author: Nathan Scott Date: Thu Jun 21 07:38:16 2007 +1000 Its difficult to draw a line between less than two points... 
From kimbrr@sgi.com Tue Jun 26 17:13:46 2007 Received: with ECARTIS (v1.0.0; list pcp); Tue, 26 Jun 2007 17:13:53 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l5R0DhtL016824 for ; Tue, 26 Jun 2007 17:13:44 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA11575 for ; Wed, 27 Jun 2007 10:13:44 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l5R0DheW2188263 for ; Wed, 27 Jun 2007 10:13:43 +1000 (AEST) Received: from localhost (kimbrr@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) with ESMTP id l5R0Dgba2188969 for ; Wed, 27 Jun 2007 10:13:42 +1000 (AEST) X-Authentication-Warning: snort.melbourne.sgi.com: kimbrr owned process doing -bs Date: Wed, 27 Jun 2007 10:13:42 +1000 From: Michael Newton X-X-Sender: kimbrr@snort.melbourne.sgi.com To: pcp@oss.sgi.com Subject: Review: PCP & pmlogger take too long to start Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1280 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: kimbrr@sgi.com Precedence: bulk X-list: pcp This is a review request. PCP takes >12s to start, and pmlogger_check >10s (in the cases where it's actually trying to start a pmlogger). On Tue, 26 Jun 2007, markgw@sgi.com via BugWorks wrote: > please post a summary of what you're intending to fix as a review > request to pcp@oss.sgi.com (and to this PV). OK, here it is. This is the first time I've gone this way. I assume I'll commit as soon as I get a (favourable) review from someone?
=========================================================================== mgmt/pcp/src/pmcd/src/agent.c =========================================================================== --- a/mgmt/pcp/src/pmcd/src/agent.c 2007-06-26 17:31:20.000000000 +1000 +++ b/mgmt/pcp/src/pmcd/src/agent.c 2007-06-26 14:28:22.912602167 +1000 @@ -166,7 +166,7 @@ found = 0; for ( i = 0; i < nAgents; i++) { ap = &agent[i]; - if (!ap->status.connected) + if (!ap->status.connected || ap->ipcType == AGENT_DSO) continue; found = 1; =========================================================================== mgmt/pcp/src/pmie/pmie_check.sh =========================================================================== --- a/mgmt/pcp/src/pmie/pmie_check.sh 2007-06-26 17:31:20.000000000 +1000 +++ b/mgmt/pcp/src/pmie/pmie_check.sh 2007-06-26 17:30:37.324129334 +1000 @@ -286,7 +286,7 @@ then : else - sleep 5 + sleep 1 $VERBOSE && echo " done" return 0 fi @@ -313,8 +313,8 @@ return 1 fi fi - sleep 5 - i=`expr $i + 5` + sleep 1 + i=`expr $i + 1` done $VERBOSE || _message restart echo " timed out waiting!" =========================================================================== mgmt/pcp/src/pmlogctl/pmlogger_check.sh =========================================================================== --- a/mgmt/pcp/src/pmlogctl/pmlogger_check.sh 2007-06-26 17:31:20.000000000 +1000 +++ b/mgmt/pcp/src/pmlogctl/pmlogger_check.sh 2007-06-26 17:26:40.439471251 +1000 @@ -207,7 +207,7 @@ then : else - sleep 5 + sleep 1 $VERBOSE && echo " done" return 0 fi @@ -244,8 +244,8 @@ return 1 fi fi - sleep 5 - i=`expr $i + 5` + sleep 1 + i=`expr $i + 1` done $VERBOSE || _message restart echo " timed out waiting!" 
=========================================================================== mgmt/pcp/src/pmlogctl/pmnewlog.sh =========================================================================== --- a/mgmt/pcp/src/pmlogctl/pmnewlog.sh 2007-06-26 17:31:20.000000000 +1000 +++ b/mgmt/pcp/src/pmlogctl/pmnewlog.sh 2007-06-26 17:27:52.761899495 +1000 @@ -106,7 +106,7 @@ then : else - sleep 5 + sleep 1 $VERBOSE && echo " done" return 0 fi @@ -120,8 +120,8 @@ _check_logfile return 1 fi - sleep 5 - i=`expr $i + 5` + sleep 1 + i=`expr $i + 1` done $VERBOSE || _message restart echo " timed out waiting!" -- Dr.Michael("Kimba")Newton kimbrr@sgi.com From kimbrr@sgi.com Tue Jun 26 18:25:46 2007 Received: with ECARTIS (v1.0.0; list pcp); Tue, 26 Jun 2007 18:25:51 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l5R1PgtL031549 for ; Tue, 26 Jun 2007 18:25:44 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA13366 for ; Wed, 27 Jun 2007 11:25:43 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l5R1PgeW2231968 for ; Wed, 27 Jun 2007 11:25:43 +1000 (AEST) Received: from localhost (kimbrr@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) with ESMTP id l5R1Pfln2231226 for ; Wed, 27 Jun 2007 11:25:42 +1000 (AEST) X-Authentication-Warning: snort.melbourne.sgi.com: kimbrr owned process doing -bs Date: Wed, 27 Jun 2007 11:25:41 +1000 From: Michael Newton X-X-Sender: kimbrr@snort.melbourne.sgi.com To: pcp@oss.sgi.com Subject: Re: Review: PCP & pmlogger take too long to start In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1281 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: 
pcp-bounce@oss.sgi.com X-original-sender: kimbrr@sgi.com Precedence: bulk X-list: pcp i wrote: > This is a review request. PCP takes >12s to start, and pmlogger_check > >10s (in the cases where its actually trying to start a pmlogger). hold up... its still taking 3s each.. im going to try to do better Dr.Michael("Kimba")Newton kimbrr@sgi.com From kimbrr@sgi.com Wed Jun 27 01:00:44 2007 Received: with ECARTIS (v1.0.0; list pcp); Wed, 27 Jun 2007 01:00:51 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l5R80dtL013337 for ; Wed, 27 Jun 2007 01:00:42 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA23065 for ; Wed, 27 Jun 2007 18:00:39 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l5R80deW2410422 for ; Wed, 27 Jun 2007 18:00:39 +1000 (AEST) Received: from localhost (kimbrr@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) with ESMTP id l5R80bSk2411118 for ; Wed, 27 Jun 2007 18:00:39 +1000 (AEST) X-Authentication-Warning: snort.melbourne.sgi.com: kimbrr owned process doing -bs Date: Wed, 27 Jun 2007 18:00:37 +1000 From: Michael Newton X-X-Sender: kimbrr@snort.melbourne.sgi.com To: pcp@oss.sgi.com Subject: Re: Review: PCP & pmlogger take too long to start In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1282 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: kimbrr@sgi.com Precedence: bulk X-list: pcp Ready! Review please.. On Wed, 27 Jun 2007, Michael Newton wrote: > i wrote: > > This is a review request. PCP takes >12s to start, and pmlogger_check > > >10s (in the cases where its actually trying to start a pmlogger). 
> > hold up... > > its still taking 3s each.. im going to try to do better In a number of cases ive moved loop iteration tests into the body.. this is because * its a good idea to try the target condition before the first sleep * its a good idea to try the target condition after the last sleep This is further complicated by not wanting to print a progress meter every tenth of a second On my test box pcp stop is now about 0.3s. By itself, a pmlogger_check doing an actual launch takes about 1.5s, but immediately following pcp restart its >3s. =========================================================================== mgmt/pcp/src/pmcd/rc_pcp =========================================================================== --- a/mgmt/pcp/src/pmcd/rc_pcp 2007-06-27 17:49:39.000000000 +1000 +++ b/mgmt/pcp/src/pmcd/rc_pcp 2007-06-27 17:49:35.029718805 +1000 @@ -100,6 +100,22 @@ ;; esac +# got usleep ? +SLEEPCMND=`which usleep 2>/dev/null | $PCP_AWK_PROG ' +BEGIN { i = 0 } +/ not in / { i = 1 } +/ aliased to / { i = 1 } + { if ( i == 0 ) print } +'` +if [ -z "$SLEEPCMND" ] +then + SLEEPCMND="sleep 1" + SLEEPINTVL=10 #tenths of a sec +else + SLEEPINTVL=1 #tenths of a sec + SLEEPCMND="$SLEEPCMND 100000" +fi + _pmcd_logfile() { default=$RUNDIR/pmcd.log @@ -383,16 +399,25 @@ fi $ECHO $PCP_ECHO_N "Waiting for PMCD to terminate ...""$PCP_ECHO_C" gone=0 - for i in 1 2 3 4 5 6 + i=0 + j=0 + while : do - sleep 3 _get_pids_by_name pmcd >$tmp.tmp if [ ! -s $tmp.tmp ] then gone=1 break fi - $ECHO $PCP_ECHO_N ".""$PCP_ECHO_C" + i=`expr $i + $SLEEPINTVL` + if [ $i -ge 10 ] + then + i=0 + [ $j -ge $delay ] && break + j=`expr $j + 1` + $ECHO $PCP_ECHO_N ".""$PCP_ECHO_C" + fi + $SLEEPCMND done if [ $gone != 1 ] # It just WON'T DIE, give up. 
then =========================================================================== mgmt/pcp/src/pmcd/src/agent.c =========================================================================== --- a/mgmt/pcp/src/pmcd/src/agent.c 2007-06-27 17:49:39.000000000 +1000 +++ b/mgmt/pcp/src/pmcd/src/agent.c 2007-06-26 14:28:22.912602167 +1000 @@ -166,7 +166,7 @@ found = 0; for ( i = 0; i < nAgents; i++) { ap = &agent[i]; - if (!ap->status.connected) + if (!ap->status.connected || ap->ipcType == AGENT_DSO) continue; found = 1; =========================================================================== mgmt/pcp/src/pmie/pmie_check.sh =========================================================================== --- a/mgmt/pcp/src/pmie/pmie_check.sh 2007-06-27 17:49:39.000000000 +1000 +++ b/mgmt/pcp/src/pmie/pmie_check.sh 2007-06-27 17:37:52.712287915 +1000 @@ -14,6 +14,22 @@ PMIE=pmie +# got usleep ? +SLEEPCMND=`which usleep 2>/dev/null | $PCP_AWK_PROG ' +BEGIN { i = 0 } +/ not in / { i = 1 } +/ aliased to / { i = 1 } + { if ( i == 0 ) print } +'` +if [ -z "$SLEEPCMND" ] +then + SLEEPCMND="sleep 1" + SLEEPINTVL=10 #tenths of a sec +else + SLEEPINTVL=1 #tenths of a sec + SLEEPCMND="$SLEEPCMND 100000" +fi + # added to handle problem when /var/log/pcp is a symlink, as first # reported by Micah_Altman@harvard.edu in Nov 2001 # @@ -146,7 +162,8 @@ # fail=true rm -f $tmp.stamp - for try in 1 2 3 4 + i=0 + while : do if pmlock -v $logfile.lock >$tmp.out then @@ -165,7 +182,9 @@ rm -f $logfile.lock fi fi - sleep 5 + [ $i -ge 200 ] && break #tenths of a sec + $SLEEPCMND + i=`expr $i + $SLEEPINTVL` done if $fail @@ -272,9 +291,9 @@ # delay=`expr $delay + 20 \* $x` i=0 - while [ $i -lt $delay ] + j=0 + while : do - $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" if [ -f $logfile ] then # $logfile was previously removed, if it has appeared again then @@ -286,7 +305,7 @@ then : else - sleep 5 + $SLEEPCMND $VERBOSE && echo " done" return 0 fi @@ -313,8 +332,15 @@ return 1 fi fi - sleep 5 - 
i=`expr $i + 5` + i=`expr $i + $SLEEPINTVL` + if [ $i -ge 10 ] + then + i=0 + [ $j -ge $delay ] && break + j=`expr $j + 1` + $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" + fi + $SLEEPCMND done $VERBOSE || _message restart echo " timed out waiting!" @@ -630,13 +656,20 @@ then $VERY_VERBOSE && ( echo; $PCP_ECHO_PROG $PCP_ECHO_N "+ $KILL -KILL `cat $tmp.pmies` ...""$PCP_ECHO_C" ) eval $KILL -KILL $pmielist >/dev/null 2>&1 - sleep 3 # give them a chance to go - if ps -f -p "$pmielist" >$tmp.alive 2>&1 - then + i=0 + while ps -f -p "$pmielist" >$tmp.alive 2>&1 + do + if [ $i -lt 30 ] + then + $SLEEPCMND + i=`expr $i + $SLEEPINTVL` + continue; + fi echo "$prog: Error: pmie process(es) will not die" cat $tmp.alive status=1 - fi + break + done fi fi =========================================================================== mgmt/pcp/src/pmlogctl/pmlogger_check.sh =========================================================================== --- a/mgmt/pcp/src/pmlogctl/pmlogger_check.sh 2007-06-27 17:49:39.000000000 +1000 +++ b/mgmt/pcp/src/pmlogctl/pmlogger_check.sh 2007-06-27 17:37:24.843964362 +1000 @@ -51,6 +51,22 @@ PWDCMND=/bin/pwd fi +# got usleep ? 
+SLEEPCMND=`which usleep 2>/dev/null | $PCP_AWK_PROG ' +BEGIN { i = 0 } +/ not in / { i = 1 } +/ aliased to / { i = 1 } + { if ( i == 0 ) print } +'` +if [ -z "$SLEEPCMND" ] +then + SLEEPCMND="sleep 1" + SLEEPINTVL=10 #tenths of a sec +else + SLEEPINTVL=1 #tenths of a sec + SLEEPCMND="$SLEEPCMND 100000" +fi + # default location # logfile=pmlogger.log @@ -194,9 +210,9 @@ # delay=`expr $delay + 20 \* $x` i=0 - while [ $i -lt $delay ] + j=0 + while : do - $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" if [ -f $logfile ] then # $logfile was previously removed, if it has appeared again @@ -207,7 +223,7 @@ then : else - sleep 5 + $SLEEPCMND $VERBOSE && echo " done" return 0 fi @@ -244,8 +260,15 @@ return 1 fi fi - sleep 5 - i=`expr $i + 5` + i=`expr $i + $SLEEPINTVL` + if [ $i -ge 10 ] + then + i=0 + [ $j -ge $delay ] && break + j=`expr $j + 1` + $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" + fi + $SLEEPCMND done $VERBOSE || _message restart echo " timed out waiting!" @@ -379,7 +402,8 @@ # fail=true rm -f $tmp.stamp - for try in 1 2 3 4 + i=0 + while : do if pmlock -v lock >$tmp.out then @@ -407,7 +431,9 @@ rm -f lock fi fi - sleep 5 + [ $i -ge 200 ] && break #tenths of a sec + $SLEEPCMND + i=`expr $i + $SLEEPINTVL` done if $fail -- Dr.Michael("Kimba")Newton kimbrr@sgi.com From nscott@aconex.com Wed Jun 27 15:51:28 2007 Received: with ECARTIS (v1.0.0; list pcp); Wed, 27 Jun 2007 15:51:34 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5RMpQtL028953 for ; Wed, 27 Jun 2007 15:51:28 -0700 Received: from edge.yarra.acx (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 27A3592C4B1 for ; Thu, 28 Jun 2007 08:51:27 +1000 (EST) Subject: kmchart updates From: Nathan Scott Reply-To: nscott@aconex.com To: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Thu, 28 Jun 2007 08:50:24 +1000 Message-Id: 
<1182984624.15488.81.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1283 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Changes committed to git://oss.sgi.com:8090/nathans/kmchart.git README | 19 +-- images/aboutqt.png |binary images/aboutqt.svg | 114 +++++++++++++++++++++++ src/chart/GNUmakefile | 2 src/chart/aboutdialog.ui | 59 +++++++----- src/chart/aboutdialog.ui.h | 3 src/chart/aboutpcpdialog.ui | 138 ---------------------------- src/chart/aboutpcpdialog.ui.h | 17 --- src/chart/chart.cpp | 8 - src/chart/kmchart.pro | 10 -- src/chart/kmchart.ui | 116 ++++++++++-------------- src/chart/kmchart.ui.h | 13 +- src/chart/seealsodialog.ui | 203 ++++++++++++++++++++++++++++++++++++++++++ src/chart/seealsodialog.ui.h | 16 +++ src/chart/settingsdialog.ui | 12 -- src/chart/tab.cpp | 36 ++++--- src/chart/tab.h | 7 - src/chart/tabdialog.ui | 12 ++ src/chart/tabdialog.ui.h | 19 +-- src/chart/view.cpp | 2 src/time/aboutdialog.ui | 56 +++++------ src/time/aboutdialog.ui.h | 2 src/time/aboutpcpdialog.ui | 153 ------------------------------- src/time/aboutpcpdialog.ui.h | 16 --- src/time/kmtime.pro | 18 +-- src/time/kmtimearch.ui | 21 +--- src/time/kmtimearch.ui.h | 8 - src/time/kmtimelive.ui | 21 +--- src/time/kmtimelive.ui.h | 8 - src/time/main.h | 2 src/time/seealsodialog.ui | 203 ++++++++++++++++++++++++++++++++++++++++++ src/time/seealsodialog.ui.h | 16 +++ 33 files changed, 777 insertions(+), 553 deletions(-) commit 83626fdb5386cdbf60f269f3e4202181f322b8c3 Author: Nathan Scott Date: Thu Jun 28 08:42:57 2007 +1000 Add scalable vector graphic file for the Qt image. commit fe85ec175ce61795a8e7e5ed045ce413ff1a3f4d Author: Nathan Scott Date: Thu Jun 28 08:08:46 2007 +1000 Rework acknowledgements dialog, add references to Qt and Qwt too. 
commit acf278988ffe07940081432183241d421ef90d87 Author: Nathan Scott Date: Tue Jun 26 07:26:11 2007 +1000 Update .ui file so QtDesigner knows about the TimeAxis widget. commit a3f25d44146d28fa1e392c3643b852b54c2b326e Author: Nathan Scott Date: Tue Jun 26 07:18:49 2007 +1000 Update TODO list after some recent bug fixes. commit 897a382ac71cda07fa47026ef1a347313751fb6c Author: Nathan Scott Date: Fri Jun 22 20:18:40 2007 +1000 Fix visibleHistory used in setRawData call, was incorrectly clobbering a local. commit a6242d6738a9c043e3c9b1d5d5c595841b866930 Author: Nathan Scott Date: Fri Jun 22 20:16:49 2007 +1000 Make fastfwd replay twice as fast, scrolls nicely on my laptop now. commit a24053397d8daa1bb7a8857b71129d6c4602feef Author: Nathan Scott Date: Fri Jun 22 08:17:55 2007 +1000 Fix the Tabdialog init values, and increment to work like Settingsdialog. commit bcf538dc231814053e5e6bab1cac7b769477c889 Author: Nathan Scott Date: Thu Jun 21 14:35:58 2007 +1000 Small cleanups to calls to setSizePolicy (remove redundant args). commit 5c11dc48a1c2694abab4c19f2f3b9838203e5210 Author: Nathan Scott Date: Thu Jun 21 10:09:06 2007 +1000 Use a QSplitter between charts, allowing independent chart vertical resizing. commit 4ea2741bfbd6864e230bf77a751899dca6ccd07b Author: Nathan Scott Date: Thu Jun 21 10:05:23 2007 +1000 Update README about pmproxy changes needed in libpcp. 
From nscott@aconex.com Wed Jun 27 19:03:10 2007
Received: with ECARTIS (v1.0.0; list pcp); Wed, 27 Jun 2007 19:03:16 -0700 (PDT)
Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5S238tL002999 for ; Wed, 27 Jun 2007 19:03:09 -0700
Received: from edge.yarra.acx (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 7CCAC92C493; Thu, 28 Jun 2007 12:03:09 +1000 (EST)
Subject: Re: Review: PCP & pmlogger take too long to start
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Michael Newton
Cc: pcp@oss.sgi.com
In-Reply-To:
References:
Content-Type: text/plain
Organization: Aconex
Date: Thu, 28 Jun 2007 12:02:07 +1000
Message-Id: <1182996127.15488.102.camel@edge.yarra.acx>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-archive-position: 1284
X-ecartis-version: Ecartis v1.0.0
Sender: pcp-bounce@oss.sgi.com
Errors-to: pcp-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: pcp

On Wed, 2007-06-27 at 18:00 +1000, Michael Newton wrote:
> ...
>
> +# got usleep ?
> +SLEEPCMND=`which usleep 2>/dev/null | $PCP_AWK_PROG '
> +BEGIN { i = 0 }
> +/ not in / { i = 1 }
> +/ aliased to / { i = 1 }
> +	{ if ( i == 0 ) print }
> +'`
> +if [ -z "$SLEEPCMND" ]
> +then
> +    SLEEPCMND="sleep 1"
> +    SLEEPINTVL=10 #tenths of a sec
> +else
> +    SLEEPINTVL=1 #tenths of a sec
> +    SLEEPCMND="$SLEEPCMND 100000"
> +fi

Repeating this in so many scripts is a bit of a shame, and it'd be
better if they were always faster (not just if usleep is found). We
should implement a "pmsleep" command (like we did for pmhostname)
if we want this sub-second sleeper on all platforms (which we do)
... it's trivial, just use nanosleep(), which is POSIX and is there
on all supported PCP platforms.

I still haven't found a usleep on Debian (what package is that in
on your SuSE/RH boxen?), and it's not there on MacOSX, so I doubt
it's there on Windows/Cygwin.
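For readers following the thread, the tenths-of-a-second polling idiom from the quoted patch can be sketched roughly as below. This is only an illustration, not code from the patch: SLEEPCMND and SLEEPINTVL reuse the patch's variable names, while wait_for is a hypothetical helper invented here. The defaults assume the portable fallback of whole-second sleeps.

```shell
#!/bin/sh
# Sketch: poll a condition until it succeeds or a budget of tenths of a
# second is used up. wait_for is a hypothetical helper, not part of PCP.
SLEEPCMND=${SLEEPCMND:-"sleep 1"}   # portable fallback: whole seconds
SLEEPINTVL=${SLEEPINTVL:-10}        # tenths of a second per sleep

# wait_for LIMIT CMD...: run CMD repeatedly; return 0 as soon as it
# succeeds, or 1 once LIMIT tenths of a second have been slept.
wait_for()
{
    limit=$1
    shift
    elapsed=0
    while :
    do
        "$@" && return 0
        [ "$elapsed" -ge "$limit" ] && return 1
        $SLEEPCMND
        elapsed=`expr $elapsed + $SLEEPINTVL`
    done
}
```

A caller might write `wait_for 200 test -f "$logfile"` to wait up to 20 seconds for a file to appear. With a sub-second sleeper, only SLEEPCMND and SLEEPINTVL change and the loop itself stays the same.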
The other alternative is that sleep(1) seems to allow sub-second sleeping these days (I've only tried the GNU tools) - but that's not standard, so we'll probably get bitten by using that. cheers. From nscott@aconex.com Fri Jun 29 00:13:54 2007 Received: with ECARTIS (v1.0.0; list pcp); Fri, 29 Jun 2007 00:14:01 -0700 (PDT) Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l5T7DqtL010698 for ; Fri, 29 Jun 2007 00:13:53 -0700 Received: from edge.yarra.acx (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 2477392C427 for ; Fri, 29 Jun 2007 17:13:54 +1000 (EST) Subject: pcp updates From: Nathan Scott Reply-To: nscott@aconex.com To: pcp@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Fri, 29 Jun 2007 17:12:54 +1000 Message-Id: <1183101174.15488.164.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 1285 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: pcp Changes committed to git://oss.sgi.com:8090/nathans/pcp.git configure.in | 48 ++--- src/libkmtime/src/GNUmakefile | 1 src/libpcp/src/units.c | 4 src/libpcp/src/util.c | 6 src/libpcp_pmc/GNUmakefile | 2 src/libpcp_pmc/examples/GNUmakefile | 2 src/libpcp_pmc/examples/fixed.c++ | 4 src/libpcp_pmc/examples/hotproc.c++ | 4 src/libpcp_pmc/pcp/GNUmakefile | 21 ++ src/libpcp_pmc/pcp/pmc/Bool.h | 31 +++ src/libpcp_pmc/pcp/pmc/Context.h | 178 +++++++++++++++++++ src/libpcp_pmc/pcp/pmc/Desc.h | 94 ++++++++++ src/libpcp_pmc/pcp/pmc/GNUmakefile | 44 ++++ src/libpcp_pmc/pcp/pmc/Group.h | 181 +++++++++++++++++++ src/libpcp_pmc/pcp/pmc/Hash.h | 125 +++++++++++++ src/libpcp_pmc/pcp/pmc/Indom.h | 170 ++++++++++++++++++ src/libpcp_pmc/pcp/pmc/List.h | 331 ++++++++++++++++++++++++++++++++++++ src/libpcp_pmc/pcp/pmc/Metric.h | 278 
++++++++++++++++++++++++++++++ src/libpcp_pmc/pcp/pmc/PMC.h | 50 +++++ src/libpcp_pmc/pcp/pmc/Source.h | 140 +++++++++++++++ src/libpcp_pmc/pcp/pmc/String.h | 139 +++++++++++++++ src/libpcp_pmc/pcp/pmc/Vector.h | 203 ++++++++++++++++++++++ src/libpcp_pmc/src/Bool.h | 31 --- src/libpcp_pmc/src/Context.c++ | 4 src/libpcp_pmc/src/Context.h | 178 ------------------- src/libpcp_pmc/src/Desc.c++ | 2 src/libpcp_pmc/src/Desc.h | 94 ---------- src/libpcp_pmc/src/GNUmakefile | 9 src/libpcp_pmc/src/Group.c++ | 12 - src/libpcp_pmc/src/Group.h | 181 ------------------- src/libpcp_pmc/src/Hash.h | 125 ------------- src/libpcp_pmc/src/Indom.c++ | 4 src/libpcp_pmc/src/Indom.h | 170 ------------------ src/libpcp_pmc/src/List.h | 331 ------------------------------------ src/libpcp_pmc/src/Metric.c++ | 4 src/libpcp_pmc/src/Metric.h | 278 ------------------------------ src/libpcp_pmc/src/PMC.h | 50 ----- src/libpcp_pmc/src/Source.c++ | 2 src/libpcp_pmc/src/Source.h | 140 --------------- src/libpcp_pmc/src/String.c++ | 4 src/libpcp_pmc/src/String.h | 139 --------------- src/libpcp_pmc/src/Vector.h | 203 ---------------------- src/libpcp_trace/src/trace.c | 8 src/pmdas/darwin/kernel.c | 10 - src/pmdas/darwin/network.c | 1 src/pmdas/linux/proc_net_dev.c | 2 src/pmdas/linux/proc_partitions.c | 2 src/pmdas/sendmail/sendmail.c | 4 src/pmdumptext/pmdumptext.c++ | 6 src/pmlogextract/logio.c | 2 src/pmlogreduce/logio.c | 2 src/pmlogreduce/pmlogreduce.c | 3 src/pmlogreduce/rewrite.c | 5 src/pmnscomp/pmnscomp.c | 5 54 files changed, 2066 insertions(+), 2001 deletions(-) commit dc28feba37a7bf6723441fe253114a82f3fe95fc Author: Nathan Scott Date: Fri Jun 29 17:03:29 2007 +1000 Add in LDIRT for libkmtime for cleaner builds. commit b22deb64f83a527b11218b85a62fde83ca183a6d Author: Nathan Scott Date: Fri Jun 29 17:01:38 2007 +1000 Rearrange source code so libpcp_pmc compiles on Mac OS X. 
    This change rearranges the libpcp_pmc header files to make the build
    work on filesystems that are case-insensitive (or can be), as is the
    default in Mac OS X. The problem is String.h matches the system header
    file string.h, both of which need to be pulled into String.c++.
    Including doesn't work (tried that, it looks like the compiler is too
    smart) either. So, I've changed the source structure to match the
    installed include structure, and changed any #includes of libpcp_pmc
    headers to specify the full header file path.

commit a4e895fefdc8595757d8d726f712bff05752be61
Author: Nathan Scott
Date: Fri Jun 29 16:29:00 2007 +1000

    Fix compiler warnings for MacOSX kernel agent for current 10.4 versions.

commit b0b421f6262e0b7300608a621fb979cb33e364be
Author: Nathan Scott
Date: Fri Jun 29 16:22:42 2007 +1000

    Resolve path naming issues with more recent versions of autoconf.

    Current versions of autoconf (2.61 for example) have added another
    level of shell escaping and indirection to certain predefined paths
    like ${prefix}. One case is ${datadir}, which is now defined in terms
    of ${datarootdir}, which is defined in terms of ${prefix}. The sed
    filtering needed to be changed to correctly handle the man page
    variable using ${mandir} - this is two levels deep now, and we were
    producing the path NONE/share/man in pcp.conf before this. This change
    is critical for current Debian/Ubuntu and MacOSX ports.

commit 2b2c72d72bca85ea7384aa7da8fbaec52b8d3644
Author: Nathan Scott
Date: Tue Jun 19 11:21:10 2007 +1000

    Fix compiler warnings on x86_64 platform; changes verified on i386.
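To illustrate the two-level indirection described in the autoconf commit, here is a minimal sketch of how a path such as ${mandir} resolves through ${datarootdir} to ${prefix}. It is not the sed filtering used in PCP's configure.in: it swaps in a plain eval loop. The helper name expand_path and the /usr prefix are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: resolve nested autoconf-style path variables by repeated eval.
# expand_path is a hypothetical helper; the values below are examples only.
prefix=/usr
datarootdir='${prefix}/share'   # single quotes keep the reference literal
mandir='${datarootdir}/man'     # two levels away from ${prefix}

# expand_path VALUE: re-evaluate VALUE until no ${...} references remain.
# Only safe for trusted input, since eval executes whatever it is given.
expand_path()
{
    value=$1
    while :
    do
        case "$value" in
        *'${'*) eval "value=\"$value\"" ;;
        *)      break ;;
        esac
    done
    echo "$value"
}

expand_path "$mandir"
```

A single pass of expansion stops at ${prefix}/share/man, which is why a filter written for one level of indirection leaves an unresolved reference behind.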
From kimbrr@sgi.com Fri Jun 29 01:11:39 2007 Received: with ECARTIS (v1.0.0; list pcp); Fri, 29 Jun 2007 01:11:46 -0700 (PDT) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l5T8BXtL027297 for ; Fri, 29 Jun 2007 01:11:36 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA07228; Fri, 29 Jun 2007 18:11:28 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l5T8BReW4848553; Fri, 29 Jun 2007 18:11:28 +1000 (AEST) Received: from localhost (kimbrr@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) with ESMTP id l5T8BOZJ4848569; Fri, 29 Jun 2007 18:11:26 +1000 (AEST) X-Authentication-Warning: snort.melbourne.sgi.com: kimbrr owned process doing -bs Date: Fri, 29 Jun 2007 18:11:24 +1000 From: Michael Newton X-X-Sender: kimbrr@snort.melbourne.sgi.com To: Nathan Scott cc: pcp@oss.sgi.com Subject: Re: Review: PCP & pmlogger take too long to start In-Reply-To: <1182996127.15488.102.camel@edge.yarra.acx> Message-ID: References: <1182996127.15488.102.camel@edge.yarra.acx> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1286 X-ecartis-version: Ecartis v1.0.0 Sender: pcp-bounce@oss.sgi.com Errors-to: pcp-bounce@oss.sgi.com X-original-sender: kimbrr@sgi.com Precedence: bulk X-list: pcp On Thu, 28 Jun 2007, Nathan Scott wrote: > On Wed, 2007-06-27 at 18:00 +1000, Michael Newton wrote: > Repeating this in so many scripts is a bit of a shame, and it'd be > better if they were faster always (not just is usleep found). We Russell said something similar.. > should implement a "pmsleep" command (like we did for pmhostname) > if we want this sub-second sleeper on all platforms (which we do) > ... 
its trivial, just use nanosleep(), which is POSIX and is there > on all supported PCP platforms. ok here it is: =========================================================================== mgmt/pcp/man/man1/GNUmakefile =========================================================================== --- a/mgmt/pcp/man/man1/GNUmakefile 2007-06-29 18:09:45.000000000 +1000 +++ b/mgmt/pcp/man/man1/GNUmakefile 2007-06-29 15:36:13.632867332 +1000 @@ -19,7 +19,7 @@ pmnsmerge.1 pmpost.1 pmprobe.1 pmsocks.1 pmstat.1 pmstore.1 \ pmtrace.1 pmval.1 pmdaweblog.1 pmlogsummary.1 pmdashping.1 \ pmdumptext.1 genpmda.1 pmproxy.1 pmdasummary.1 pmlogreduce.1 \ - autofsd-probe.1 pmie2col.1 telnet-probe.1 + autofsd-probe.1 pmie2col.1 telnet-probe.1 pmsleep.1 MAN_DEST = $(PCP_MAN_DIR)/man$(MAN_SECTION) LSRCFILES = $(MAN_PAGES) =========================================================================== mgmt/pcp/man/man1/pmsleep.1 =========================================================================== --- a/mgmt/pcp/man/man1/pmsleep.1 2006-06-17 00:58:24.000000000 +1000 +++ b/mgmt/pcp/man/man1/pmsleep.1 2007-06-29 15:48:28.024750676 +1000 @@ -0,0 +1,41 @@ +'\"macro stdmacro +.\" +.\" Copyright (c) 2007 Silicon Graphics, Inc. All Rights Reserved. +.\" +.\" $Id$ +.ie \(.g \{\ +.\" ... groff (hack for khelpcenter, man2html, etc.) +.TH PMSLEEP 1 "SGI" "Performance Co-Pilot" +\} +.el \{\ +.if \nX=0 .ds x} PMSLEEP 1 "SGI" "Performance Co-Pilot" +.if \nX=1 .ds x} PMSLEEP 1 "Performance Co-Pilot" +.if \nX=2 .ds x} PMSLEEP 1 "" "\&" +.if \nX=3 .ds x} PMSLEEP "" "" "\&" +.TH \*(x} +.rr X +\} +.SH NAME +\f3pmsleep\f1 \- portable subsecond-capable sleep +.\" literals use .B or \f3 +.\" arguments use .I or \f2 +.SH SYNOPSIS +.B $PCP_BINADM_DIR/pmsleep +.I interval +.SH DESCRIPTION +.B pmsleep +sleeps for +.I interval. 
+The +.I interval +argument follows the syntax described in +.BR PCPIntro (1) +for +.B \-t, +and in the simplest form may be an unsigned integer +or floating point constant +(the implied units in this case are seconds). + +.PP +The exit status is 0 for success, or 1 for a malformed command line. +If the underlying nanosleep fails, an errno is returned. =========================================================================== mgmt/pcp/src/GNUmakefile =========================================================================== --- a/mgmt/pcp/src/GNUmakefile 2007-06-29 18:09:45.000000000 +1000 +++ b/mgmt/pcp/src/GNUmakefile 2007-06-29 14:46:06.336727771 +1000 @@ -21,7 +21,7 @@ pmdumplog pmlogextract pmstore pmhostname pmgenmap pmlogctl \ pmlogconf pmlogsummary pmclient pmkstat pcp pmlc dbpmda \ xconfirm pmtrace pmstat pmsocks pmdas pmafm procmemstat \ - pmlogreduce genpmda pmproxy telnet-probe + pmlogreduce genpmda pmproxy telnet-probe pmsleep ifneq ($(TARGET_OS), cygwin) SUBDIRS += libpcp_pmc pmdumptext autofsd-probe =========================================================================== mgmt/pcp/src/pmcd/rc_pcp =========================================================================== --- a/mgmt/pcp/src/pmcd/rc_pcp 2007-06-29 18:09:45.000000000 +1000 +++ b/mgmt/pcp/src/pmcd/rc_pcp 2007-06-29 16:07:49.625951131 +1000 @@ -100,6 +100,8 @@ ;; esac +SLEEPCMND="$PCP_BINADM_DIR/pmsleep 0.1" + _pmcd_logfile() { default=$RUNDIR/pmcd.log @@ -383,16 +385,25 @@ fi $ECHO $PCP_ECHO_N "Waiting for PMCD to terminate ...""$PCP_ECHO_C" gone=0 - for i in 1 2 3 4 5 6 + i=0 + j=0 + while : do - sleep 3 _get_pids_by_name pmcd >$tmp.tmp if [ ! -s $tmp.tmp ] then gone=1 break fi - $ECHO $PCP_ECHO_N ".""$PCP_ECHO_C" + i=`expr $i + 1` + if [ $i -ge 10 ] + then + i=0 + [ $j -ge $delay ] && break + j=`expr $j + 1` + $ECHO $PCP_ECHO_N ".""$PCP_ECHO_C" + fi + $SLEEPCMND done if [ $gone != 1 ] # It just WON'T DIE, give up. 
then =========================================================================== mgmt/pcp/src/pmcd/src/agent.c =========================================================================== --- a/mgmt/pcp/src/pmcd/src/agent.c 2007-06-29 18:09:45.000000000 +1000 +++ b/mgmt/pcp/src/pmcd/src/agent.c 2007-06-26 14:28:22.912602167 +1000 @@ -166,7 +166,7 @@ found = 0; for ( i = 0; i < nAgents; i++) { ap = &agent[i]; - if (!ap->status.connected) + if (!ap->status.connected || ap->ipcType == AGENT_DSO) continue; found = 1; =========================================================================== mgmt/pcp/src/pmie/pmie_check.sh =========================================================================== --- a/mgmt/pcp/src/pmie/pmie_check.sh 2007-06-29 18:09:45.000000000 +1000 +++ b/mgmt/pcp/src/pmie/pmie_check.sh 2007-06-29 16:05:39.146867878 +1000 @@ -14,6 +14,8 @@ PMIE=pmie +SLEEPCMND="$PCP_BINADM_DIR/pmsleep 0.1" + # added to handle problem when /var/log/pcp is a symlink, as first # reported by Micah_Altman@harvard.edu in Nov 2001 # @@ -146,7 +148,8 @@ # fail=true rm -f $tmp.stamp - for try in 1 2 3 4 + i=0 + while : do if pmlock -v $logfile.lock >$tmp.out then @@ -165,7 +168,9 @@ rm -f $logfile.lock fi fi - sleep 5 + [ $i -ge 200 ] && break #tenths of a sec + $SLEEPCMND + i=`expr $i + 1` done if $fail @@ -272,9 +277,9 @@ # delay=`expr $delay + 20 \* $x` i=0 - while [ $i -lt $delay ] + j=0 + while : do - $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" if [ -f $logfile ] then # $logfile was previously removed, if it has appeared again then @@ -286,7 +291,7 @@ then : else - sleep 5 + $SLEEPCMND $VERBOSE && echo " done" return 0 fi @@ -313,8 +318,15 @@ return 1 fi fi - sleep 5 - i=`expr $i + 5` + i=`expr $i + 1` + if [ $i -ge 10 ] + then + i=0 + [ $j -ge $delay ] && break + j=`expr $j + 1` + $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" + fi + $SLEEPCMND done $VERBOSE || _message restart echo " timed out waiting!" 
@@ -630,13 +642,20 @@ then $VERY_VERBOSE && ( echo; $PCP_ECHO_PROG $PCP_ECHO_N "+ $KILL -KILL `cat $tmp.pmies` ...""$PCP_ECHO_C" ) eval $KILL -KILL $pmielist >/dev/null 2>&1 - sleep 3 # give them a chance to go - if ps -f -p "$pmielist" >$tmp.alive 2>&1 - then + i=0 + while ps -f -p "$pmielist" >$tmp.alive 2>&1 + do + if [ $i -lt 30 ] + then + $SLEEPCMND + i=`expr $i + 1` + continue; + fi echo "$prog: Error: pmie process(es) will not die" cat $tmp.alive status=1 - fi + break + done fi fi =========================================================================== mgmt/pcp/src/pmlogctl/pmlogger_check.sh =========================================================================== --- a/mgmt/pcp/src/pmlogctl/pmlogger_check.sh 2007-06-29 18:09:45.000000000 +1000 +++ b/mgmt/pcp/src/pmlogctl/pmlogger_check.sh 2007-06-29 16:03:21.068767724 +1000 @@ -51,6 +51,8 @@ PWDCMND=/bin/pwd fi +SLEEPCMND="$PCP_BINADM_DIR/pmsleep 0.1" + # default location # logfile=pmlogger.log @@ -194,9 +196,9 @@ # delay=`expr $delay + 20 \* $x` i=0 - while [ $i -lt $delay ] + j=0 + while : do - $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" if [ -f $logfile ] then # $logfile was previously removed, if it has appeared again @@ -207,7 +209,7 @@ then : else - sleep 5 + $SLEEPCMND $VERBOSE && echo " done" return 0 fi @@ -244,8 +246,15 @@ return 1 fi fi - sleep 5 - i=`expr $i + 5` + i=`expr $i + 1` + if [ $i -ge 10 ] + then + i=0 + [ $j -ge $delay ] && break + j=`expr $j + 1` + $VERBOSE && $PCP_ECHO_PROG $PCP_ECHO_N ".""$PCP_ECHO_C" + fi + $SLEEPCMND done $VERBOSE || _message restart echo " timed out waiting!" 
@@ -379,7 +388,8 @@ # fail=true rm -f $tmp.stamp - for try in 1 2 3 4 + i=0 + while : do if pmlock -v lock >$tmp.out then @@ -407,7 +417,9 @@ rm -f lock fi fi - sleep 5 + [ $i -ge 200 ] && break #tenths of a sec + $SLEEPCMND + i=`expr $i + 1` done if $fail =========================================================================== mgmt/pcp/src/pmsleep/GNUmakefile =========================================================================== --- a/mgmt/pcp/src/pmsleep/GNUmakefile 2006-06-17 00:58:24.000000000 +1000 +++ b/mgmt/pcp/src/pmsleep/GNUmakefile 2007-06-29 14:33:28.335332331 +1000 @@ -0,0 +1,25 @@ +#!gmake +# +# Copyright (c) 2007 Silicon Graphics, Inc. All Rights Reserved. +# +# $Id$ +# + +TOPDIR = ../.. +include $(TOPDIR)/src/include/builddefs + +LLDLIBS = -lpcp +CFILES = pmsleep.c +CMDTARGET = pmsleep$(EXECSUFFIX) +LDIRT = $(TARGET) + +default: $(CMDTARGET) + +include $(BUILDRULES) + +install: $(CMDTARGET) + $(INSTALL) -m 755 $(CMDTARGET) $(PCP_BINADM_DIR)/$(CMDTARGET) + +default_pcp: default + +install_pcp: install =========================================================================== mgmt/pcp/src/pmsleep/pmsleep.c =========================================================================== --- a/mgmt/pcp/src/pmsleep/pmsleep.c 2006-06-17 00:58:24.000000000 +1000 +++ b/mgmt/pcp/src/pmsleep/pmsleep.c 2007-06-29 14:59:57.087491258 +1000 @@ -0,0 +1,35 @@ +/* + * Copyright (c) 2007 Silicon Graphics, Inc. All Rights Reserved. 
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <time.h>
+#include "pmapi.h"
+
+int
+main(int argc, char **argv)
+{
+    struct timespec rqt;
+    struct timeval delta;
+    int r = 0;
+    char *msg;
+
+    if (argc == 2) {
+        if (pmParseInterval(argv[1], &delta, &msg) < 0) {
+            fputs(msg, stderr);
+            free(msg);
+        } else {
+            rqt.tv_sec = delta.tv_sec;
+            rqt.tv_nsec = delta.tv_usec * 1000;
+            if (0 != nanosleep(&rqt, NULL))
+                r = errno;
+
+            exit(r);
+        }
+    }
+    fprintf(stderr, "Usage: pmsleep interval\n");
+    exit(1);
+    /*NOTREACHED*/
+}

Dr.Michael("Kimba")Newton kimbrr@sgi.com
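The wait loops in the rc_pcp and *_check.sh hunks above use two counters: i counts 0.1 second ticks and j counts whole seconds, so the script polls ten times a second but prints only one progress dot per second. A standalone sketch of that pattern follows. The names TICK and wait_with_dots are hypothetical, and the default TICK assumes a pmsleep binary is on the PATH.

```shell
#!/bin/sh
# Sketch of the two-counter wait loop: i counts 0.1s ticks, j counts seconds.
# TICK and wait_with_dots are hypothetical names, not taken from the patch.
TICK=${TICK:-"pmsleep 0.1"}

# wait_with_dots DELAY CMD...: poll CMD every tick and print a dot for each
# second counted; return 0 once CMD succeeds, or 1 after more than DELAY
# seconds' worth of ticks have gone by.
wait_with_dots()
{
    delay=$1
    shift
    i=0
    j=0
    while :
    do
        "$@" && return 0
        i=`expr $i + 1`
        if [ $i -ge 10 ]
        then
            i=0
            [ $j -ge $delay ] && return 1
            j=`expr $j + 1`
            printf .
        fi
        $TICK
    done
}
```

Because the timeout is only checked on each tenth tick, after j has already been incremented DELAY times, the loop runs for roughly one second longer than DELAY before giving up. That may be worth keeping in mind when choosing the delay value.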