Hi, Ken -
Thanks a lot for the trace.
> kenj@bozo:~/src/pcp/qa$ ps -ef | grep '[p]mie.*pmmgr'
> pcp 7698 29916 0 08:21 pts/1 00:00:00 /usr/bin/pmie -c
> /var/log/pcp/pmmgr/bozo/config.pmie -h local: -f -l
> /var/log/pcp/pmmgr/bozo/pmie.log
> pcp 14241 14776 0 08:37 pts/1 00:00:00 sh -c /usr/bin/pmie -c
> /var/log/pcp/pmmgr/bozo/config.pmie -h local: -f -l
> /var/log/pcp/pmmgr/bozo/pmie.log
> pcp 14243 14241 0 08:37 pts/1 00:00:00 /usr/bin/pmie -c
> /var/log/pcp/pmmgr/bozo/config.pmie -h local: -f -l
> /var/log/pcp/pmmgr/bozo/pmie.log
> pcp 14794 29916 0 07:46 pts/1 00:00:00 /usr/bin/pmie -c
> /var/log/pcp/pmmgr/bozo/config.pmie -h local: -f -l
> /var/log/pcp/pmmgr/bozo/pmie.log
> pcp 32065 29916 0 08:20 pts/1 00:00:00 /usr/bin/pmie -c
> /var/log/pcp/pmmgr/bozo/config.pmie -h local: -f -l
> /var/log/pcp/pmmgr/bozo/pmie.log
Note that the pmmgr.log file comes from pmmgr pid 14776, which has
killed its child pmies/etc. multiple times. Just one (pid 14241/14243
from your list) is still alive, which is correct. The question is
where the others are from: who is pid 29916 and why is she still
running?
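
To pick the strays out of a listing like the one quoted above, a rough sketch follows. The function name pmies_with_parent is made up for illustration, and it assumes the "ps -ef" column layout shown in the trace (UID PID PPID C STIME TTY TIME CMD); other ps variants may order the columns differently.

```shell
# Hypothetical helper, not part of pcp: read "ps -ef"-style lines on stdin
# and print the PID and start time of every pmie whose parent is the given pid.
# Assumes columns: UID PID PPID C STIME TTY TIME CMD...
pmies_with_parent() {
    awk -v parent="$1" '$3 == parent && $8 ~ /pmie$/ { print $2, $5 }'
}

# Possible usage (pid taken from the trace above):
#   ps -ef | pmies_with_parent 29916
# To see what the parent itself is:
#   ps -o pid,ppid,user,lstart,args -p 29916
```

Anything this prints for a pid other than the live pmmgr (or the sh it spawned) would be a leftover from an earlier run.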
Oh, you later say it's "init"? Interesting (why not pid 1?); if so,
those old pmies must have come from a previous pmmgr process that has
gone away and didn't send out a SIGTERM memo. Was that perhaps a
pmmgr run during pcpqa? If so, please consider shutting down the
system pmmgr during test time, so we get a clean pmmgr.log etc. (It's
unfortunate that system processes conflict with pcpqa.)
> The word "sig" (ignoring case) does not appear to be in the pmmgr.log file.
(Sorry, "killed" is in there.)
- FChE