
URGENT potentially serious regression in 3.7.0

To: pcp@xxxxxxxxxxx
Subject: URGENT potentially serious regression in 3.7.0
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Sun, 10 Mar 2013 07:21:56 +1100
I had suspected, without any proof, that PCP QA was running much slower.

I started to look at 169 failing with an error return of "Timeout waiting for a response from PMCD" rather than "IPC protocol failure", which I thought was a minor issue but is in fact a regression ... when pmcd times out on the pmda ipc, it used to (and should) send an ipc error response to the client waiting on the pmda response.

This no longer happens ... the pmda timeout happens, but the client is left hanging until its own timeout on the pmcd ipc goes off ... this is wrong.
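To make the difference concrete, here is a minimal sketch of the two behaviours. All type and function names are hypothetical and this is not PCP source; only the two error strings are taken from the report above.

```c
/* Hypothetical names throughout; this is not PCP source code. */
enum agent_status { AGENT_REPLIED, AGENT_TIMED_OUT };
enum daemon_reply { REPLY_RESULT, REPLY_IPC_ERROR, REPLY_NOTHING };

/* Expected behaviour: a pmda timeout is turned into an error reply. */
enum daemon_reply reply_expected(enum agent_status s)
{
    return s == AGENT_REPLIED ? REPLY_RESULT : REPLY_IPC_ERROR;
}

/* Regressed behaviour as described: nothing is sent after a pmda timeout. */
enum daemon_reply reply_regressed(enum agent_status s)
{
    return s == AGENT_REPLIED ? REPLY_RESULT : REPLY_NOTHING;
}

/* What the client ends up reporting for each kind of reply. */
const char *client_message(enum daemon_reply r)
{
    switch (r) {
    case REPLY_RESULT:    return "ok"; /* placeholder for a normal result */
    case REPLY_IPC_ERROR: return "IPC protocol failure";
    default:              /* no reply: the client's own timer expires */
        return "Timeout waiting for a response from PMCD";
    }
}
```

With `AGENT_TIMED_OUT`, the expected path ends in "IPC protocol failure", while the regressed path leaves the client to run out its own timer and report "Timeout waiting for a response from PMCD".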

But much more seriously, in the process of investigating this, I turned on all diags for pmcd and arrggghhh .... millions of lines of output of the form

__pmDataIPC: fd=974
__pmDataIPC: fd=974, data=0xb84623e0(sz=8)

where fd increments from 0 to 1027 (or thereabouts) and this repeats 56 times in the short life of pmcd for qa/169.

This looks like a problem with the fds for client ipc moving up into the large 1024+ range and some sort of iteration over all possible fds.
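As a rough illustration of why that pattern is expensive, the sketch below contrasts a pass over every possible fd with a pass over only the fds known to be in use. The table layout and function names are assumptions made for the example, not how pmcd actually tracks its descriptors.

```c
#include <stddef.h>

/*
 * Hypothetical per-fd table: in_use[fd] is nonzero when fd belongs to a
 * live connection. Returns how many fds the loop touches, which is what
 * drives the volume of per-fd diagnostic output.
 */
size_t touched_scanning_every_fd(const unsigned char *in_use, int max_fd)
{
    size_t touched = 0;
    for (int fd = 0; fd <= max_fd; fd++) {
        (void)in_use[fd];   /* per-fd work happens here, in use or not */
        touched++;
    }
    return touched;
}

/*
 * Alternative: keep a compact list of active fds and walk only that.
 * The work is proportional to the number of connections, not to the
 * highest fd number.
 */
size_t touched_scanning_active_list(const int *active, size_t nactive)
{
    size_t touched = 0;
    for (size_t i = 0; i < nactive; i++) {
        (void)active[i];    /* per-fd work happens here */
        touched++;
    }
    return touched;
}
```

With three live connections whose highest fd is 1027, the first function touches 1028 fds per pass and the second touches 3.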

This needs to be fixed before any 3.7.0 release is contemplated.
