
Re: [pcp] pcp updates: containers, qa

To: Nathan Scott <nathans@xxxxxxxxxx>, PCP <pcp@xxxxxxxxxxx>
Subject: Re: [pcp] pcp updates: containers, qa
From: Mark Goodwin <mgoodwin@xxxxxxxxxx>
Date: Wed, 13 May 2015 12:17:12 +1000
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <1009828124.17466209.1431416607069.JavaMail.zimbra@xxxxxxxxxx>
References: <1009828124.17466209.1431416607069.JavaMail.zimbra@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 05/12/2015 05:43 PM, Nathan Scott wrote:
> Changes committed to git://git.pcp.io/nathans/pcp.git master
>
> Nathan Scott (3):
>        pmdalinux: fix container issues, especially with networking metrics
>        qa: fix typo in an error message in test 540
>        build: extend gitignore file set for pmdaroot

The code all looks OK to me, but testing failed: pmdaroot segfaults
with a NULL cp->name in the fetch callback. This is with a script running
that continually creates busybox containers that basically just run
ifconfig -a and then exit.

I also noticed that containers.state.running was showing far too many
instances. It should show zero, or at most one ("docker ps" shows none running).

To reproduce, run the following:

# while true; do docker run busybox ifconfig eth0; done

Then in another window, run "pminfo -f containers" a few times.

I'll try to find some time to figure this out later today, but
for now here's a gdb traceback:

Program received signal SIGSEGV, Segmentation fault.
0x000000000040315b in root_fetchCallBack (mdesc=0x607360 <root_metrictab+32>, inst=<optimized out>, atom=0x7ffcc9973f60) at root.c:202
202                 atom->cp = *cp->name == '/' ? cp->name+1 : cp->name;
(gdb) where
#0  0x000000000040315b in root_fetchCallBack (mdesc=0x607360 <root_metrictab+32>, inst=<optimized out>, atom=0x7ffcc9973f60) at root.c:202
#1  0x00007fad98c6c7b4 in pmdaFetch (numpmid=<optimized out>, pmidlist=<optimized out>, resp=<optimized out>, pmda=0x1913010)
    at callback.c:573
#2  0x00007fad98c6ef22 in __pmdaMainPDU (dispatch=dispatch@entry=0x7ffcc9974160) at mainloop.c:179
#3  0x0000000000402457 in root_main (dp=0x7ffcc9974160) at root.c:682
#4  main (argc=<optimized out>, argv=<optimized out>) at root.c:767
(gdb) l 195
190             containers = INDOM(CONTAINERS_INDOM);
191             sts = pmdaCacheLookup(containers, inst, &name, (void**)&cp);
192             if (sts < 0)
193                 return sts;
194             if (sts != PMDA_CACHE_ACTIVE)
195                 return PM_ERR_INST;
196             root_refresh_container_values(name, cp);
197             switch (idp->item) {
198             case 0:         /* containers.engine */
199                 atom->cp = cp->engine->name;
(gdb) l
200                 break;
201             case 1:         /* containers.name */
202                 atom->cp = *cp->name == '/' ? cp->name+1 : cp->name;
203                 break;
204             case 2:         /* containers.pid */
205                 atom->ul = cp->pid;
206                 break;
207             case 3:         /* containers.state.running */
208                 atom->ul = (cp->status & CONTAINER_FLAG_RUNNING) != 0;
209                 break;
(gdb) p cp
$1 = (container_t *) 0x1914490
(gdb) p *cp
$2 = {pid = 0, status = 0, name = 0x0,
cgroup = "system.slice/docker-e720313565c7817bb4aa1c287ef0908d6960b20c7ec36f9d1a50a069a03a5b6b.scope", '\000' <repeats 37 times>, stat = {st_dev = 0, st_ino = 0, st_nlink = 0, st_mode = 0, st_uid = 0, st_gid = 0, __pad0 = 0, st_rdev = 0, st_size = 0, st_blksize = 0, st_blocks = 0, st_atim = {tv_sec = 0, tv_nsec = 0}, st_mtim = {tv_sec = 0, tv_nsec = 0}, st_ctim = {tv_sec = 0, tv_nsec = 0},
    __glibc_reserved = {0, 0, 0}}, engine = 0x6074c0 <engines>}

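So we segfaulted dereferencing cp->name at root.c:202: the cache entry is
active, but its name is still NULL (name = 0x0 in the struct dump).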
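
As a stopgap while the real cause is tracked down, the callback could check
for a missing name before dereferencing it. The following is only a sketch
under assumptions, not the actual fix: the container_t here is a trimmed-down
stand-in for the real struct, container_display_name() is a hypothetical
helper that does not exist in root.c, and SKETCH_ERR_INST is a placeholder
for whatever error the callback should return (PM_ERR_INST might be a
reasonable choice, but that is a guess).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for pmdaroot's container_t. */
typedef struct {
    int   pid;
    int   status;
    char *name;     /* NULL in the crash: entry cached but never named */
} container_t;

/* Placeholder error code for this sketch; not a real PCP constant. */
#define SKETCH_ERR_INST (-1)

/*
 * Store the container's display name in *out, skipping a leading '/'
 * the same way root.c:202 does. Returns 0 on success, or
 * SKETCH_ERR_INST when there is no name to report, instead of
 * dereferencing a NULL pointer.
 */
int
container_display_name(const container_t *cp, const char **out)
{
    assert(out != NULL);
    if (cp == NULL || cp->name == NULL)
        return SKETCH_ERR_INST;
    *out = (cp->name[0] == '/') ? cp->name + 1 : cp->name;
    return 0;
}
```

A guard like this would only hide the symptom. It does not explain why an
entry with pid = 0 and name = 0x0 is reported as PMDA_CACHE_ACTIVE in the
first place, which may also be related to the extra containers.state.running
instances.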
