
Re: [pcp] QA Status

To: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Subject: Re: [pcp] QA Status
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Thu, 7 Jul 2016 20:14:51 -0400 (EDT)
Cc: pcp@xxxxxxxxxxx
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <577EC6B9.3000503@xxxxxxxxxxxxxxxx>
References: <577AF2CD.60104@xxxxxxxxxxxxxxxx> <172419962.3662696.1467677239015.JavaMail.zimbra@xxxxxxxxxx> <577EC6B9.3000503@xxxxxxxxxxxxxxxx>
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Thread-index: cPmeEgIFutX2xW09/6k7xb36hdWnIQ==
Thread-topic: QA Status
Mornin',

----- Original Message -----
> On 05/07/16 10:07, Nathan Scott wrote:
> > ...
> > I've pushed in a fix for 722, 'twas a memory corruption problem.
> 
> Hi Nathan,
> 
> 722 is still failing on some hosts, e.g. on grundy I'm seeing this
> (running the container test by hand)
> 
> kenj@grundy-dmz:~/src/pcp/qa> TEST_SET_CONTAINER=abc012345 python
> src/test_set_source.python
> == Test ==
> Hosts: None
> Archives: None
> Container: abc012345
> Traceback (most recent call last):
>    File "src/test_set_source.python", line 93, in <module>
>      test.connect()
>    File "src/test_set_source.python", line 88, in connect
>      self.context = pmapi.pmContext.fromOptions(self.opts, sys.argv)
>    File "/usr/local/lib/python2.6/site-packages/pcp/pmapi.py", line
> 1204, in fromOptions
>      context = builder(typed, source)
>    File "/usr/local/lib/python2.6/site-packages/pcp/pmapi.py", line
> 1162, in __init__
>      raise pmErr(self._ctx, [target])
> pcp.pmapi.pmErr: Operation not supported ['local:']
> 
> And similar results on vm14 and vm21.
> 

Hmm, that's an error code from pmNewContext - a different kind of problem
to the last one - what does "pmprobe -v pmcd.features" say?  I'm gonna
guess pmcd.feature.containers has value 0 on all those hosts.

If so, I suppose we either need a pmprobe-based _notrun() here, or we
could split the containers check out into a new test.
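
Something along these lines at the top of the python bits might do the
trick - an untested sketch only, and the exit status / wrapper hookup is
just a guess at how we'd feed it into _notrun:

    # rough sketch, untested: skip the container checks when pmcd on the
    # local host has no container support (pmcd.feature.containers != 1)
    import sys
    import cpmapi as c_api
    from pcp import pmapi

    def containers_supported():
        try:
            ctx = pmapi.pmContext(c_api.PM_CONTEXT_HOST, "local:")
            pmids = ctx.pmLookupName("pmcd.feature.containers")
            descs = ctx.pmLookupDescs(pmids)
            result = ctx.pmFetch(pmids)
            atom = ctx.pmExtractValue(result.contents.get_valfmt(0),
                                      result.contents.get_vlist(0, 0),
                                      descs[0].contents.type,
                                      c_api.PM_TYPE_U32)
            return atom.ul == 1
        except pmapi.pmErr:
            return False   # no such metric (older pmcd), or fetch failed

    if not containers_supported():
        print("Notrun: no container support in pmcd on this host")
        sys.exit(1)        # the shell wrapper would map this onto _notrun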

> > I saw 1108 fail once, but never again & running it in a loop isn't
> > able to hit it, so I'm wondering if it's related to some state left
> > behind from an earlier test.  I'll keep digging.
> 
> Your guess seems correct ... the 1108.full records show pmnewlog trying
> to kill off TWO pmloggers ... this is badness ... I've made a change to
> pmnewlog to detect this and report the details and I hope this will
> identify where the additional process is coming from.

Sounds good.  In other news, I saw a single spurious 651 failure the
other day ... so I'm not sure we've got to the bottom of that one yet
either.  :(
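
For the record, the "more than one pmlogger" check you describe might
look roughly like the sketch below - python purely for illustration (the
real change lives in the pmnewlog script), and scanning /proc is just one
way to enumerate and report the offending processes:

    # illustration only: list every pmlogger process with its pid and
    # full command line, so a stray pmlogger left over from an earlier
    # test is visible in the failure output
    import os

    def pmlogger_processes():
        procs = []
        for pid in os.listdir("/proc"):
            if not pid.isdigit():
                continue
            try:
                with open("/proc/%s/cmdline" % pid) as f:
                    cmdline = f.read().replace("\0", " ").strip()
            except IOError:
                continue        # raced with process exit
            if "pmlogger" in cmdline:
                procs.append((pid, cmdline))
        return procs

    loggers = pmlogger_processes()
    if len(loggers) > 1:
        print("Found %d pmlogger processes:" % len(loggers))
        for pid, cmdline in loggers:
            print("  pid %s: %s" % (pid, cmdline))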

cheers.

--
Nathan
