hotproc pmda failing in qa/800

To: PCP <pcp@xxxxxxxxxxx>
Subject: hotproc pmda failing in qa/800
From: Ken McDonell <kenj@xxxxxxxxxxxxxxxx>
Date: Wed, 24 Dec 2014 20:02:20 +1100
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
Cut these dbpmda snippets from qa/800

kenj@vm24:~/src/pcp/qa$ sudo dbpmda
.dbpmdarc> open pipe /var/lib/pcp/pmdas/proc/pmdaproc -d 3
Start pmdaproc PMDA: /var/lib/pcp/pmdas/proc/pmdaproc -d 3
.dbpmdarc> getdesc on
.dbpmdarc> attr "username" "root"
Attribute: username=root
Success
.dbpmdarc> attr 11 "0"
Attribute: userid=0
Success
.dbpmdarc> 
dbpmda> wait 11
dbpmda> fetch hotproc.nprocs
PMID(s): 3.52.99
pmResult dump from 0x8f38700 timestamp: 0.000000 10:00:00.000 numpmid: 1
  3.52.99 (hotproc.nprocs): numval: 1 valfmt: 0 vlist[]:
   value 0
dbpmda> fetch hotproc.control.config
PMID(s): 3.60.8
pmResult dump from 0x8f38700 timestamp: 0.000000 10:00:00.000 numpmid: 1
  3.60.8 (hotproc.control.config): numval: 1 valfmt: 1 vlist[]:
   value ""

Down to here we're good ... then do this store

dbpmda> store hotproc.control.config 'fname=="pmdaproc"'
PMID: 3.60.8
Getting description...
Getting Result Structure...
3.60.8: "" -> "fname=="pmdaproc""
Sending Result...
dbpmda> 

Which returns (note the dbpmda> prompt) ... then O(10) seconds later, kaboom

*** Error in `/var/lib/pcp/pmdas/proc/pmdaproc': free(): invalid next size (fast): 0x09e6c868 ***
======= Backtrace: =========
/lib/libc.so.6(+0x6dfd3)[0xb754bfd3]
/lib/libc.so.6(+0x7418a)[0xb755218a]
/lib/libc.so.6(+0x74dcc)[0xb7552dcc]
/var/lib/pcp/pmdas/proc/pmdaproc[0x8054604]
/usr/lib/libpcp.so.3(+0x36fdf)[0xb76c5fdf]
linux-gate.so.1(__kernel_sigreturn+0x0)[0xb7732400]
linux-gate.so.1(__kernel_vsyscall+0xe)[0xb7732422]
/lib/libc.so.6(__read+0x23)[0xb75ba863]
/usr/lib/libpcp.so.3(+0x143b8)[0xb76a33b8]
/usr/lib/libpcp.so.3(__pmGetPDU+0x82)[0xb76a3df2]
/usr/lib/libpcp_pmda.so.3(__pmdaMainPDU+0xa8)[0xb7706328]
/usr/lib/libpcp_pmda.so.3(pmdaMain+0x28)[0xb7706f28]
/var/lib/pcp/pmdas/proc/pmdaproc[0x804a224]
/lib/libc.so.6(__libc_start_main+0xf3)[0xb74f79d3]
/var/lib/pcp/pmdas/proc/pmdaproc[0x804a271]
======= Memory map: ========
08048000-08064000 r-xp 00000000 08:02 84314      /var/lib/pcp/pmdas/proc/pmdaproc
08064000-08065000 r--p 0001b000 08:02 84314      /var/lib/pcp/pmdas/proc/pmdaproc
08065000-08068000 rw-p 0001c000 08:02 84314      /var/lib/pcp/pmdas/proc/pmdaproc
08068000-08069000 rw-p 00000000 00:00 0 
09e56000-09e77000 rw-p 00000000 00:00 0          [heap]
b73d3000-b73ee000 r-xp 00000000 08:02 4856       /lib/libgcc_s.so.1
b73ee000-b73ef000 r--p 0001a000 08:02 4856       /lib/libgcc_s.so.1
b73ef000-b73f0000 rw-p 0001b000 08:02 4856       /lib/libgcc_s.so.1
b740a000-b743f000 r--s 00000000 00:10 7329       /var/run/nscd/group
b743f000-b7474000 r--s 00000000 00:10 7328       /var/run/nscd/passwd
b7474000-b7475000 rw-p 00000000 00:00 0 
b7475000-b748d000 r-xp 00000000 08:02 229        /lib/libpthread-2.18.so
b748d000-b748e000 ---p 00018000 08:02 229        /lib/libpthread-2.18.so
b748e000-b748f000 r--p 00018000 08:02 229        /lib/libpthread-2.18.so
b748f000-b7490000 rw-p 00019000 08:02 229        /lib/libpthread-2.18.so
b7490000-b7492000 rw-p 00000000 00:00 0 
b7492000-b74d6000 r-xp 00000000 08:02 215        /lib/libm-2.18.so
b74d6000-b74d7000 r--p 00043000 08:02 215        /lib/libm-2.18.so
b74d7000-b74d8000 rw-p 00044000 08:02 215        /lib/libm-2.18.so
b74d8000-b74d9000 rw-p 00000000 00:00 0 
b74d9000-b74dc000 r-xp 00000000 08:02 28         /lib/libdl-2.18.so
b74dc000-b74dd000 r--p 00002000 08:02 28         /lib/libdl-2.18.so
b74dd000-b74de000 rw-p 00003000 08:02 28         /lib/libdl-2.18.so
b74de000-b7689000 r-xp 00000000 08:02 196        /lib/libc-2.18.so
b7689000-b768b000 r--p 001ab000 08:02 196        /lib/libc-2.18.so
b768b000-b768c000 rw-p 001ad000 08:02 196        /lib/libc-2.18.so
b768c000-b768f000 rw-p 00000000 00:00 0 
b768f000-b76f8000 r-xp 00000000 08:02 82175      /usr/lib/libpcp.so.3
b76f8000-b76fa000 r--p 00069000 08:02 82175      /usr/lib/libpcp.so.3
b76fa000-b76fb000 rw-p 0006b000 08:02 82175      /usr/lib/libpcp.so.3
b76fb000-b76ff000 rw-p 00000000 00:00 0 
b76ff000-b7713000 r-xp 00000000 08:02 82332      /usr/lib/libpcp_pmda.so.3
b7713000-b7714000 r--p 00013000 08:02 82332      /usr/lib/libpcp_pmda.so.3
b7714000-b7715000 rw-p 00014000 08:02 82332      /usr/lib/libpcp_pmda.so.3
b7715000-b7717000 rw-p 00000000 00:00 0 
b772d000-b772e000 rw-p 00000000 00:00 0 
b772e000-b772f000 r--s 00000000 08:02 82549      /var/lib/pcp/pmdas/proc/help.pag
b772f000-b7730000 r--s 00000000 08:02 82535      /var/lib/pcp/pmdas/proc/help.dir
b7730000-b7732000 rw-p 00000000 00:00 0 
b7732000-b7733000 r-xp 00000000 00:00 0          [vdso]
b7733000-b7754000 r-xp 00000000 08:02 11929      /lib/ld-2.18.so
b7754000-b7755000 r--p 00020000 08:02 11929      /lib/ld-2.18.so
b7755000-b7756000 rw-p 00021000 08:02 11929      /lib/ld-2.18.so
bf859000-bf87a000 rw-p 00000000 00:00 0          [stack]

I'd guess the store is trashing the heap, and that this does not bite until
the next update cycle.

Running the pmda within valgrind within dbpmda gives a bit more info, but, I
suspect, no insight into the real cause.

dbpmda> ==12161== Invalid write of size 1
==12161==    at 0x402E8D0: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==12161==    by 0x80545CF: hotproc_eval_procs (string3.h:104)
==12161==    by 0x409DFDE: onalarm (AF.c:272)
==12161==  Address 0x4323fc8 is 0 bytes after a block of size 8 alloc'd
==12161==    at 0x4029EAD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==12161==    by 0x80545B2: hotproc_eval_procs (proc_pid.c:640)
==12161==    by 0x409DFDE: onalarm (AF.c:272)
==12161== 

Sorry, that's all I can extract ... attaching gdb to the pmdaproc process did 
not help (suspect it is off in some pthread).

But it is 100% reproducible here ...
vm24        3.10.1   i686    openSUSE 13.1 (Bottle)
and the same QA test is failing on 5 other hosts (although I've not triaged the 
failures there).

Ahh, but wait ... inspection of the proc PMDA source file proc_pid.c at line 
640 exposes a classic off-by-one error.
Fix that (commit coming later) and qa/800 passes on vm24.
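
For anyone reading the valgrind report without the source to hand: a 1-byte
write landing 0 bytes after a block of size 8 is what you would expect to see
when an 8-character string such as "pmdaproc" is copied, terminator and all,
into a buffer sized by strlen() alone. The sketch below only illustrates that
general class of off-by-one. It is an assumption about the shape of the bug,
not the code at proc_pid.c:640, and copy_name() is a hypothetical helper that
does not exist in PCP.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PCP source): return a heap copy
 * of a NUL-terminated string, or NULL if allocation fails.
 */
static char *copy_name(const char *name)
{
    size_t len = strlen(name);      /* 8 for "pmdaproc", NUL not counted */

    /*
     * The off-by-one form would be malloc(len): the copy below would
     * then write the terminating NUL one byte past the end of the block.
     * Allocating len + 1 leaves room for the terminator.
     */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, name, len + 1);    /* copies the string and its NUL */
    return copy;
}
```

With the extra byte in place, copy_name("pmdaproc") writes exactly 9 bytes
into a 9-byte block, and the caller frees the result as usual.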