
RE: [pcp] pcp 3.3.3-1 problem

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: RE: [pcp] pcp 3.3.3-1 problem
From: "Siekas, Greg" <greg.siekas@xxxxxxxxxx>
Date: Wed, 18 Aug 2010 18:04:54 -0700
Accept-language: en-US
Cc: "pcp@xxxxxxxxxxx" <pcp@xxxxxxxxxxx>
In-reply-to: <116548332.153381282178734197.JavaMail.root@xxxxxxxxxxxxxxxxxx>
References: <E2C9DAC4C471FE4198A1E2DFA6D6D82427A2A886F1@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <116548332.153381282178734197.JavaMail.root@xxxxxxxxxxxxxxxxxx>
Thread-index: Acs/N+MXMJw2bR8+R5KNuAkLNV6UoQAAOPkg
Thread-topic: [pcp] pcp 3.3.3-1 problem
Hmmm... I rebooted these nodes this morning and now the problem appears to have 
disappeared.
Very strange.  I am going back and looking at when 3.3.3-1 was installed; it 
could be that something was out of sync in this cluster.

-----Original Message-----
From: Nathan Scott [mailto:nathans@xxxxxxxxxx] 
Sent: Wednesday, August 18, 2010 5:46 PM
To: Siekas, Greg
Cc: pcp@xxxxxxxxxxx
Subject: Re: [pcp] pcp 3.3.3-1 problem


----- "Greg Siekas" <greg.siekas@xxxxxxxxxx> wrote:

> Nathan,
> 
> Thanks for the reply; here are the details you requested.
> 
> It's failing on the kernel.pernode.cpu.nice metric.
> 

Interesting - it looks a lot like a problem I fixed just
before 3.3.3 ... are you sure you are on 3.3.3 and not
3.3.2?  Can you start pmcd and send the output from the
pcp(1) command?

> ...
> kernel.pernode.cpu.user
>     inst [0 or "node0"] value 6776690
>     inst [1 or "node1"] value 6913290
> kernel.pernode.cpu.nice
> kernel.pernode.cpu.nice: pmFetch: IPC protocol failure
> ...

I have an x86_64 system here which is exactly this config
and it's got no issues.  It *did* have issues on 3.3.2 ...
so, hopefully this is just a case of mistaken identity.

If not, the next thing to do will be to run pmcd via valgrind
with the same (-f) args, retry the fetch test, then see what
valgrind reports when it fails.
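
When retrying the fetch test repeatedly, it can help to pick out failing
metrics automatically. The following is only a rough sketch, not part of
PCP: it assumes the error lines look like the one quoted above
("<metric>: pmFetch: <message>"), and the names find_fetch_failures and
SAMPLE are made up for illustration.

```python
import re

# Assumed error-line shape, taken from the output quoted in this thread:
#   kernel.pernode.cpu.nice: pmFetch: IPC protocol failure
FETCH_ERROR = re.compile(r"^(?P<metric>[A-Za-z][\w.]*): pmFetch: (?P<error>.+)$")


def find_fetch_failures(pminfo_output):
    """Return (metric, error) pairs for every failed fetch in the text."""
    failures = []
    for line in pminfo_output.splitlines():
        # Drop leading mail quoting ("> ") and indentation so that text
        # pasted from a reply is handled the same as raw command output.
        cleaned = line.lstrip("> ").rstrip()
        match = FETCH_ERROR.match(cleaned)
        if match:
            failures.append((match.group("metric"), match.group("error")))
    return failures


# Sample text copied from the output quoted earlier in the thread.
SAMPLE = """\
kernel.pernode.cpu.user
    inst [0 or "node0"] value 6776690
    inst [1 or "node1"] value 6913290
kernel.pernode.cpu.nice
kernel.pernode.cpu.nice: pmFetch: IPC protocol failure
"""

if __name__ == "__main__":
    for metric, error in find_fetch_failures(SAMPLE):
        print(f"{metric}: {error}")
```

In practice the text would presumably come from capturing the fetch test
itself (for example the output of "pminfo -f" on the kernel.pernode.cpu
metrics) rather than from a hard-coded sample.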

cheers.

-- 
Nathan