[pcp] pcp updates: pmdaproc, cgroups, books
Frank Ch. Eigler
fche at redhat.com
Tue Nov 18 11:52:54 CST 2014
> [...]
> Nathan Scott (2):
> pmda proc: rework existing per-cgroup metrics, and add new ones
> [...]
Nice work! Tried it out; some observations:
#1. After installation, and a couple of iterations of
% pminfo -f cgroups    # and/or cgroup.cpuacct.usage_percpu
I suddenly got a persistent:
cgroup.cpuacct.usage
inst [0 or "/"] value 258190128361866
inst [1 or "/docker"] value 32018287616
inst [2 or "/system"] value 4970946920425
pmNameIndom: indom=3.21 inst=3: Unknown or illegal instance identifier
inst [3] value 229040599
pmNameIndom: indom=3.21 inst=4: Unknown or illegal instance identifier
inst [4] value 53662058907
pmNameIndom: indom=3.21 inst=5: Unknown or illegal instance identifier
inst [5] value 18446744073709551614
pmNameIndom: indom=3.21 inst=6: Unknown or illegal instance identifier
[...]
cgroup.blkio.dev.io_queued.async
No value(s) available!
cgroup.blkio.dev.io_queued.total
No value(s) available!
[...]
In /var/log/pcp/pmcd/pmcd.log, possibly related:
[Tue Nov 18 12:01:57] pmcd(1840) Error: ClientLoop: error sending Conn ACK PDU to new client IPC protocol failure
#2. After a restart, I tried to trigger the same problem again. This time
the results simply stopped changing, and strace, watching for open(2)
syscalls, told the tale:
open("/proc/diskstats", O_RDONLY) = 5
open("/sys/fs/cgroup/blkio/blkio.io_merged", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.io_queued", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.io_service_bytes", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.io_serviced", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.io_service_time", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.io_wait_time", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.sectors", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/blkio.time", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/docker/blkio.io_merged", O_RDONLY) = -1 EMFILE (Too many open files)
open("/sys/fs/cgroup/blkio/docker/blkio.io_queued", O_RDONLY) = -1 EMFILE (Too many open files)
And indeed lsof showed many open file descriptors, in this case on
/sys/fs/cgroup/.../usage_percpu files. Of all the cgroup.* metrics, that
one seems to be the only fd leaker, so this trivial fix may be enough:
--- a/src/pmdas/linux_proc/cgroups.c
+++ b/src/pmdas/linux_proc/cgroups.c
@@ -458,8 +458,10 @@ read_percpuacct_usage(const char *file, const char *name)
     if ((fp = fopen(file, "r")) == NULL)
         return -ENOENT;
     p = fgets(buffer, sizeof(buffer), fp);
-    if (!p)
+    if (!p) {
+        fclose(fp);
         return -ENOMEM;
+    }
 
     for (cpu = 0; ; cpu++) {
         value = strtoull(p, &endp, 0);
@@ -480,6 +482,7 @@ read_percpuacct_usage(const char *file, const char *name)
         percpuacct->usage = value;
         pmdaCacheStore(indom, PMDA_CACHE_ADD, inst, percpuacct);
     }
+    fclose(fp);
     return 0;
 }
 
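Incidentally, the same fix could be written with a single exit path, which
tends to stay leak-proof as more early returns get added later. A sketch of
that alternative shape (hypothetical function name, not a proposed patch):

#include <errno.h>
#include <stdio.h>

/* Sketch only: single-exit variant of the fix above, so any future
 * early-error path still passes through the one fclose(). */
static int
read_percpuacct_usage_sketch(const char *file)
{
    char    buffer[16 * 1024];
    FILE    *fp;
    int     sts = 0;

    if ((fp = fopen(file, "r")) == NULL)
        return -ENOENT;
    if (fgets(buffer, sizeof(buffer), fp) == NULL) {
        sts = -ENOMEM;
        goto done;
    }
    /* ... parse per-CPU usage values from buffer here ... */
done:
    fclose(fp);
    return sts;
}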
#3. From looking further at strace and the code, it looks as though the pmda
may be doing too much work per fetch. For example, when fetching only
the cgroup.cpuacct.usage_percpu metric, strace shows:
open("/sys/fs/cgroup/cpu,cpuacct//system/lvm2-lvmetad.service/cpuacct.stat", O_RDONLY) = 185
open("/sys/fs/cgroup/cpu,cpuacct//system/lvm2-lvmetad.service/cpuacct.usage", O_RDONLY) = 185
open("/sys/fs/cgroup/cpu,cpuacct//system/lvm2-lvmetad.service/cpuacct.usage_percpu", O_RDONLY) = 185
i.e., data for unrelated metrics is being gathered on every fetch. From the
code, the same appears to happen in other groups of metrics too.
This is reminiscent of <http://oss.sgi.com/bugzilla/show_bug.cgi?id=1067>.
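For what it's worth, here is a rough sketch of the kind of per-cluster
gating that could avoid the extra reads, along the lines of a pmda fetch
entry point. The cluster enum and refresh helpers below are illustrative
stand-ins, not the actual linux_proc code, and pmid_cluster() is the
libpcp helper for extracting a PMID's cluster field, if I recall its name
correctly:

#include <pcp/pmapi.h>
#include <pcp/impl.h>
#include <pcp/pmda.h>

/* Illustrative cluster ids; the real linux_proc clusters differ. */
enum { CLUSTER_CPUACCT, CLUSTER_BLKIO, NUM_CLUSTERS };

extern void refresh_cpuacct(void);  /* assumed per-subsystem readers */
extern void refresh_blkio(void);

static int
proc_fetch(int numpmid, pmID pmidlist[], pmResult **resp, pmdaExt *pmda)
{
    int     i, need[NUM_CLUSTERS] = { 0 };

    /* Mark only the clusters actually named in this request ... */
    for (i = 0; i < numpmid; i++) {
        int cluster = pmid_cluster(pmidlist[i]);
        if (cluster >= 0 && cluster < NUM_CLUSTERS)
            need[cluster] = 1;
    }
    /* ... and read only those subsystems' files. */
    if (need[CLUSTER_CPUACCT])
        refresh_cpuacct();
    if (need[CLUSTER_BLKIO])
        refresh_blkio();
    return pmdaFetch(numpmid, pmidlist, resp, pmda);
}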
#4. Instance-domain unawareness. A variant of #3: when a pcp client asks for
just one cgroup instance, the pmda fetches them all anyway:
% pminfo -f cgroup.cpuacct.usage
[... pick one indom ...]
% strace -eopen -f -p `pgrep pmdaproc` &
% pmval -i /system/pmcd.service cgroup.cpuacct.usage
[...]
open("/sys/fs/cgroup/cpu,cpuacct//system/plymouth-start.service/cpuacct.stat", O_RDONLY) = 969
open("/sys/fs/cgroup/cpu,cpuacct//system/plymouth-start.service/cpuacct.usage", O_RDONLY) = 969
open("/sys/fs/cgroup/cpu,cpuacct//system/plymouth-start.service/cpuacct.usage_percpu", O_RDONLY) = 969
open("/sys/fs/cgroup/cpu,cpuacct//system/dracut-initqueue.service/cpuacct.stat", O_RDONLY) = 970
open("/sys/fs/cgroup/cpu,cpuacct//system/dracut-initqueue.service/cpuacct.usage", O_RDONLY) = 970
open("/sys/fs/cgroup/cpu,cpuacct//system/dracut-initqueue.service/cpuacct.usage_percpu", O_RDONLY) = 970
[...]
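One way out might be for the refresh code to consult the client's instance
profile before touching each cgroup directory. A rough sketch, assuming the
cgroup names live in a pmdaCache-managed indom (as the rework appears to
do), and that peeking at pmdaExt's e_prof via the internal __pmInProfile()
helper is acceptable for a pmda:

#include <pcp/pmapi.h>
#include <pcp/impl.h>
#include <pcp/pmda.h>

/* Sketch: skip cgroups the current client's profile excludes, so
 * "pmval -i /system/pmcd.service ..." costs one directory, not all. */
static void
refresh_cpuacct_filtered(pmInDom indom, __pmProfile *profile)
{
    char    *name;
    int     inst;

    pmdaCacheOp(indom, PMDA_CACHE_WALK_REWIND);
    while ((inst = pmdaCacheOp(indom, PMDA_CACHE_WALK_NEXT)) != -1) {
        if (profile && !__pmInProfile(indom, profile, inst))
            continue;       /* instance not requested: no open(2)s */
        if (pmdaCacheLookup(indom, inst, &name, NULL) != PMDA_CACHE_ACTIVE)
            continue;
        /* read /sys/fs/cgroup/cpu,cpuacct/<name>/cpuacct.* here */
    }
}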
#5. Multiple-metrics unawareness (so to speak). A variant of #3, where a
pcp client asks for two metrics (without indom restrictions); strace
indicates the pmda's effort is doubled:
% strace -eopen -f -p `pgrep pmdaproc` &
% pminfo -f cgroup.cpuacct.usage cgroup.blkio.all.sectors
[...]
open("/proc/cgroups", O_RDONLY) = 5
open("/proc/mounts", O_RDONLY) = 5
open("/proc/stat", O_RDONLY) = 5
open("/sys/fs/cgroup/cpu,cpuacct/cpuacct.stat", O_RDONLY) = 239
open("/sys/fs/cgroup/cpu,cpuacct/cpuacct.usage", O_RDONLY) = 239
open("/sys/fs/cgroup/cpu,cpuacct/cpuacct.usage_percpu", O_RDONLY) = 239
[...]
open("/proc/diskstats", O_RDONLY) = 5
open("/sys/fs/cgroup/blkio/blkio.io_merged", O_RDONLY) = 285
open("/sys/fs/cgroup/blkio/blkio.io_queued", O_RDONLY) = 285
open("/sys/fs/cgroup/blkio/blkio.io_service_bytes", O_RDONLY) = 285
[...]
open("/sys/fs/cgroup/blkio/docker/blkio.time", O_RDONLY) = 285
open("/proc/cgroups", O_RDONLY) = 5
open("/proc/mounts", O_RDONLY) = 5
open("/proc/stat", O_RDONLY) = 5
open("/sys/fs/cgroup/cpu,cpuacct/cpuacct.stat", O_RDONLY) = 285
open("/sys/fs/cgroup/cpu,cpuacct/cpuacct.usage", O_RDONLY) = 285
open("/sys/fs/cgroup/cpu,cpuacct/cpuacct.usage_percpu", O_RDONLY) = 285
open("/sys/fs/cgroup/cpu,cpuacct/docker/cpuacct.stat", O_RDONLY) = 329
open("/sys/fs/cgroup/cpu,cpuacct/docker/cpuacct.usage", O_RDONLY) = 329
[...]
open("/proc/cgroups", O_RDONLY) = 5
open("/proc/mounts", O_RDONLY) = 5
open("/proc/diskstats", O_RDONLY) = 5
open("/sys/fs/cgroup/blkio/blkio.io_merged", O_RDONLY) = 331
open("/sys/fs/cgroup/blkio/blkio.io_queued", O_RDONLY) = 331
open("/sys/fs/cgroup/blkio/blkio.io_service_bytes", O_RDONLY) = 331
[...]
Note the above traffic all came from one pminfo run, not two.
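Combined with the per-cluster gating sketched under #3, a cheap mitigation
for the repeated /proc/cgroups + /proc/mounts setup might be to memoize it
per fetch request. A sketch with a hypothetical serial counter (again, not
the actual linux_proc code):

#include <pcp/pmapi.h>
#include <pcp/pmda.h>

/* Sketch: bump a serial number once per fetch request, and let each
 * shared setup routine no-op if it already ran for that serial. */
static unsigned int fetch_serial;

static void
refresh_cgroup_filesys(void)    /* e.g. /proc/cgroups and /proc/mounts */
{
    static unsigned int done_serial;

    if (done_serial == fetch_serial)
        return;                 /* already refreshed for this fetch */
    done_serial = fetch_serial;
    /* ... open and parse /proc/cgroups, /proc/mounts here ... */
}

static int
proc_fetch(int numpmid, pmID pmidlist[], pmResult **resp, pmdaExt *pmda)
{
    fetch_serial++;             /* invalidate memoized state per request */
    refresh_cgroup_filesys();
    /* ... per-cluster refreshes as sketched under #3 ... */
    return pmdaFetch(numpmid, pmidlist, resp, pmda);
}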
(Some of these may be cases of false confidence from testing primarily
against the pcpqa artificial system files, as in linux/cgroups-root*.tgz.
That machinery is clever & useful, but cannot be as thorough as a live
system.)
- FChE