Re: weird error

To: Alan Bailey <abailey@xxxxxxxxxxxxx>
Subject: Re: weird error
From: Ken McDonell <kenmcd@xxxxxxxxxxxxxxxxx>
Date: Fri, 1 Dec 2000 09:33:48 +1100
Cc: pcp@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.10.10011301610510.9609-100000@osage.ncsa.uiuc.edu>
Reply-to: kenmcd@xxxxxxx
Sender: owner-pcp@xxxxxxxxxxx
OK, you are faster than I am!

The problem is clearly reproducible, and the diagnosis is below.

On Thu, 30 Nov 2000, Alan Bailey wrote:

> Thanks for the debugging information, it's helpful.  I forgot to mention
> in the first email that the simple pmda works just fine as a daemon.

Yes, I've discovered this too and there's a strong hint there.

> Without further ado, here are three pminfo commands with the -D
> profile,pdu options.  The first is just for simple.numfetch, the second is
> for simple.color, and the third is for simple.color and mem.freemem:
> 
> ...

[OK stuff deleted]

> lanner % pminfo -f -D profile,pdu simple.color mem.freemem
> ...
> [3305]pmXmitPDU: FETCH fd=3 len=36
> 000:       24     7003      ce9        0        0        0  2000000  100403f 
> 008:  a04000f 

Note 0x100403f is the PMID for simple.color and 0xa04000f is the PMID for
mem.freemem being sent from pminfo to pmcd.
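
To relate those hex words to the dotted form that pminfo prints further
down (253.0.1), here is a minimal sketch in Python. It rests on two
assumptions that are not spelled out in this mail: the dump shows each
32-bit word of the PDU body byte-swapped relative to the PMID value, and
a PMID packs the item in the low 10 bits, the cluster in the next 12
bits, and the domain above that. The helper names swap32 and decode_pmid
are made up for this illustration.

```python
import struct


def swap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return struct.unpack("<I", struct.pack(">I", word))[0]


def decode_pmid(dumped_word: int) -> str:
    """Turn a word as shown in the PDU dump into domain.cluster.item.

    Assumed layout (not stated in this thread): item in bits 0-9,
    cluster in bits 10-21, domain in the bits above that.
    """
    pmid = swap32(dumped_word)
    item = pmid & 0x3FF
    cluster = (pmid >> 10) & 0xFFF
    domain = (pmid >> 22) & 0x1FF  # field width is an assumption
    return f"{domain}.{cluster}.{item}"


print(decode_pmid(0x100403F))  # 253.0.1
print(decode_pmid(0xA04000F))  # 60.1.10
```

Under those assumptions the first word decodes to 253.0.1, which agrees
with the pmResult dump quoted below.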

> [3305]pmGetPDU: RESULT fd=3 len=96 from=3285 moreinput? no
> 000:       60     7001      cd5 e2d1263a caea0c00  2000000  100403f  3000000 
> 008:        0        0  4000000  1000000 68000000  2000000 cc000000  100403f 
> 016:  3000000        0        0  4000000  1000000 68000000  2000000 cc000000 
> pmResult dump from 0x804e168 timestamp: 975622626.846538 16:17:06.846
> numpmid: 2  253.0.1 (simple.color): numval: 3 valfmt: 0 vlist[]:
>     inst [0 or "red"] value 4
>     inst [1 or "green"] value 104
>     inst [2 or "blue"] value 204
>   253.0.1 (simple.color): numval: 3 valfmt: 0 vlist[]:
>     inst [0 or "red"] value 4
>     inst [1 or "green"] value 104
>     inst [2 or "blue"] value 204

When the answer comes back, PMID 0xa04000f has vanished and 0x100403f
appears twice ... this is BOGUS!
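
The same check can be done mechanically. The sketch below walks the body
of the RESULT dump quoted above (the words after the three header words
and the two timestamp words) and compares the PMIDs that came back with
the ones that were sent in the FETCH. The body layout used here (a
count, then per metric a PMID, numval, valfmt and numval inst/value
pairs) is inferred from this dump alone, and the function names are
hypothetical; this is not code from pminfo or pmcd.

```python
import struct


def swap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return struct.unpack("<I", struct.pack(">I", word))[0]


# PMIDs from the FETCH dump, exactly as printed there.
REQUESTED = [0x100403F, 0xA04000F]

# RESULT dump words 5..23, exactly as printed (header and timestamp skipped).
RESULT_BODY = [
    0x2000000,
    0x100403F, 0x3000000, 0x0,
    0x0, 0x4000000, 0x1000000, 0x68000000, 0x2000000, 0xCC000000,
    0x100403F, 0x3000000, 0x0,
    0x0, 0x4000000, 0x1000000, 0x68000000, 0x2000000, 0xCC000000,
]


def returned_pmids(body):
    """Collect the PMID of each value set, kept in the dump's own form."""
    words = iter(body)
    numpmid = swap32(next(words))
    pmids = []
    for _ in range(numpmid):
        pmids.append(next(words))
        numval = swap32(next(words))
        if numval > 0:
            next(words)  # valfmt
            for _ in range(numval):
                next(words)  # instance identifier
                next(words)  # value
    return pmids


def mismatches(requested, returned):
    """List (position, requested, returned) wherever the PMIDs differ."""
    return [
        (i, hex(want), hex(got))
        for i, (want, got) in enumerate(zip(requested, returned))
        if want != got
    ]


print(mismatches(REQUESTED, returned_pmids(RESULT_BODY)))
# [(1, '0xa04000f', '0x100403f')]
```

The second slot is the one that is wrong: 0xa04000f was asked for and
0x100403f came back.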

> ...
> 
> mem.freemem
>     value 4
>     value 104
>     value 204

pminfo uses the name of the second metric (mem.freemem) on the reasonable
assumption that the PMID should match.

The problem shows up because mem.* and simple.* are handled by two
different DSO agents and this fetch asks for metrics from both, which is
why making simple a daemon makes the problem go away.

Expect a fix real soon.

