pcp updates - mostly pmie

To: pcp@xxxxxxxxxxx
Subject: pcp updates - mostly pmie
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Mon, 17 Feb 2014 17:38:56 +1100
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
A couple of curly bug fixes for pmie in here, plus QA fallout.

Changes committed to git://oss.sgi.com/kenj/pcp.git dev

 qa/.gitignore          |    2 
 qa/115                 |    4 
 qa/115.out             |    1 
 qa/228                 |   16 
 qa/321                 |    1 
 qa/514                 |    7 
 qa/514.out.3           | 2438 +++++++++++++++++++++++++++++++++++++++++++++++++
 qa/520                 |    7 
 qa/520.out.3           |  597 +++++++++++
 qa/523                 |   17 
 qa/523.out             |   14 
 qa/523.out.1           |  497 +++++++++
 qa/523.out.2           |  499 ++++++++++
 qa/733                 |   17 
 qa/733.out             |    1 
 qa/733.out.1           |  239 ++++
 qa/733.out.2           |  240 ++++
 qa/815                 |   32 
 qa/815.out             |   12 
 qa/common.filter       |    1 
 qa/group               |    1 
 src/pmie/src/fetch.sk  |    9 
 src/pmie/src/meta      |    4 
 src/pmie/src/pmie.c    |    2 
 src/pmmgr/rc_pmmgr     |    2 
 src/pmwebapi/rc_pmwebd |    2 
 26 files changed, 4632 insertions(+), 30 deletions(-)

commit c8c1a0ce13dcf307314a5bfeb7bb9471154a3fda
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 16:18:33 2014 +1100

    pmie - another day one bug, this time in count_* operators
    
    The operand of the count_inst, count_host and count_sample operators
    is a logical expression.
    
    If the expression is set-valued (multiple instances for count_inst,
    multiple hosts for count_host, etc.), then the code did not check for
    the tri-state value of UNKNOWN (or DUNNO internally); rather, it added
    the values assuming 0 for FALSE and 1 for TRUE ... DUNNO is 2, which
    explains why the result was TWICE the size of the instance domain
    if the expression was undefined over an instance domain.
    
    Part 2 of the bug Chandana de Silva discovered with this simple
    pmie rule:
        count_inst( match_inst "httpd" proc.psinfo.pid > 0)  > 0
        -> print "count %v";
    that reported 6 most of the time, and sometimes reported 1400+
    
    qa/815 now exercises something similar.
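
The arithmetic behind the "twice the size" symptom can be sketched in a few lines. This is an illustration only, not the pmie source: the function names and the 700-instance domain are hypothetical, and only the encodings FALSE=0, TRUE=1, DUNNO=2 come from the commit message. The message does not say how the fixed code reports a set containing DUNNO values, so the sketch simply counts the TRUE ones.

```python
# Numeric encodings as described in the commit message.
FALSE, TRUE, DUNNO = 0, 1, 2


def naive_count(truth_values):
    """Buggy approach: sum the raw encodings, so every DUNNO adds 2."""
    return sum(truth_values)


def tristate_count(truth_values):
    """Count only the TRUE values; FALSE and DUNNO contribute nothing."""
    return sum(1 for v in truth_values if v == TRUE)


# Hypothetical instance domain of 700 processes, all values unknown.
unknown_everywhere = [DUNNO] * 700
print(naive_count(unknown_everywhere))     # 1400
print(tristate_count(unknown_everywhere))  # 0
```

With every instance unknown, the naive sum is exactly double the instance domain size, which is consistent with the occasional 1400+ result reported for the rule above.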

commit 82ab0b5504648e8cfd5fa42958af5cbbbf38875d
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 16:12:35 2014 +1100

    qa/733 - updated output after pmie bug fix
    
    Seeing some more results with defined values after the first fetch.

commit ea2c1769470715e182e53c1dee5f09ac21dd39b4
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 16:03:59 2014 +1100

    qa/523 - updated output after pmie bug fix
    
    Seeing some more results with defined values after the first fetch.

commit 7d41f28d952fe4f8a3cc8c4fc8afe11e12d15ba8
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 16:02:11 2014 +1100

    qa/520 - updated output after pmie bug fix
    
    Seeing some more results with defined values after the first fetch.

commit 10c792f36296a10ddcd2c13bbbb43af0df4bf8b1
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 15:55:19 2014 +1100

    qa/514 - updated output after pmie bug fix
    
    Seeing some more results with defined values after the first fetch.

commit d1dfe14b083f9730f5cbd973a6764dabcb953d65
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 15:46:17 2014 +1100

    pmie - day one bug in fetch logic
    
    On the first fetch and the fetch after a dynamic instance domain
    has changed membership, the values may have been incorrectly
    marked as "not valid", preventing rules being evaluated correctly.
    
    Part 1 of the bug Chandana de Silva discovered with this simple
    pmie rule:
        count_inst( match_inst "httpd" proc.psinfo.pid > 0)  > 0
        -> print "count %v";
    that reported 6 most of the time, and sometimes reported 1400+

commit 1b883a0ca8bb91fe675973a2ed21760f7cb7b72a
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 14:57:32 2014 +1100

    qa/321 - dodge permission issue
    
    "... warning cannot create stats file dir ..." message still appearing
    after pmie change to quieten this because we're using pmie -v here.
    
    Filter these lines out.
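
The filtering idea is simple enough to sketch. The PCP QA tests are shell scripts and the commit message does not show the filter that was added, so the Python below is only an assumption-laden illustration: the function name and the sample lines are hypothetical, and the match uses just the fragment of the warning quoted above.

```python
def drop_stats_dir_warning(lines):
    """Return the lines that do not contain the stats-dir warning fragment."""
    marker = "warning cannot create stats file dir"
    return [line for line in lines if marker not in line]


# Hypothetical pmie -v output; only the warning fragment is from the commit.
sample = [
    "pmie: warning cannot create stats file dir (details elided)",
    "rule output line",
]
print(drop_stats_dir_warning(sample))
```

Matching on a fixed substring rather than the whole line keeps the filter independent of the elided parts of the message.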

commit 6717436693b165a1a5401d3577e078a2a5aa0cc8
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 14:47:24 2014 +1100

    pmie - quieten warning after tmp dir perms change
    
    After recent changes to the mode and ownership of the
    $PCP_VAR_DIR/tmp directory, the message "... warning cannot create
    stats file dir ..."  may be emitted each time pmie is run as a
    user other than "pcp".
    
    This happens a LOT in QA.
    
    Since the message is only a warning, and the only side-effect is
    that the pmie process is not visible in the instance domain of
    the pmcd.pmie metrics (it is likely that no one but kenj cares!),
    I've suppressed the message unless one of the pmie verbose flags
    is set (-v, -V or -W).
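
The gating described here amounts to a single check on the verbose options. The sketch below is not how pmie parses its command line (pmie is written in C, and combined or repeated options are ignored here); the function name is hypothetical, and only the flag names -v, -V and -W come from the commit message.

```python
def should_warn_stats_dir(argv):
    """Emit the stats-dir warning only if a verbose flag was given."""
    verbose_flags = {"-v", "-V", "-W"}
    return any(arg in verbose_flags for arg in argv)


print(should_warn_stats_dir(["pmie", "-c", "rules"]))  # False
print(should_warn_stats_dir(["pmie", "-v"]))           # True
```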

commit 292d0374f4702db6225223b96bfb74fc9850e250
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 14:45:29 2014 +1100

    qa/228 - dodge permission issue
    
    "... warning cannot create stats file dir ..." message still appearing
    after pmie change to quieten this because we're using pmie -v here.
    
    Filter these lines out.

commit 017ec17ff3563047bb92806d69b0a9a1cfc38fbd
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 14:41:22 2014 +1100

    qa/115 - non-determinism in pmie stop init script
    
    The message "...: PMIE not running" may or may not be there ... add
    filter and strip from expected output

commit 0f0ba77214193f6a415365e9ee2db2a53a6e8a6d
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 14:39:11 2014 +1100

    qa/815 [new] - tickle pmie bug
    
    pmie bug in count_<foo> method when boolean expression is UNKNOWN
    ... thanks to Chandana de Silva for pointing out the example that
    showed this.

commit 421b195fc4197eda0c5a93488f9e6c1681186cb4
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Mon Feb 17 14:35:49 2014 +1100

    qa/common.filter - strip blank lines from pmie init script output
    
    Recent changes to the pmie init scripts seem to have introduced the
    possibility of blank lines being output ... make 'em go away so we
    don't get unwanted QA failures.
    
    Example blank line output ...
                                 <---- HERE
    /etc/init.d/pmie: Warning: Performance Co-Pilot Inference Engine (pmie) is disabled.
        To enable pmie, run the following as root:
         update-rc.d -f pmie remove
         update-rc.d pmie defaults 94 06

commit 4e3bc79318def74c5a2433461942465c6e6b94c5
Author: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date:   Sun Feb 16 16:33:19 2014 +1100

    pmwebd and pmmgr ... terser start up messages
    
    To be consistent with other PCP bits-n-bobs, we've been using terser
    messages from the init scripts, so this commit changes
    
    Performance Co-Pilot starting pmwebd (logfile is /var/log/pcp/pmwebd/pmwebd.log) ...
    and
    Performance Co-Pilot starting pmmgr (logfile is /var/log/pcp/pmmgr/pmmgr.log) ...
    
    to become
    Starting pmwebd ...
    and
    Starting pmmgr ...
