Every PCP client fetch of a container-aware metric involves IPC and context
switches to and from pmdaroot. For example, in the bug #1109 scenario, an
strace of pmdaroot shows:
recvfrom(6, "\1\220\0\0\32\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\2\0\0\0000\0", 8192,
0, NULL, NULL) = 26
stat("/var/lib/docker/containers", {st_mode=S_IFDIR|0700, st_size=20480, ...})
= 0
stat("/var/lib/lxc", 0x7ffd4bab1310) = -1 ENOENT (No such file or directory)
stat("/var/lib/docker/containers/1a478e901ca8e98da2d02060c89a480a6d016ba33d30a3947004ae2538892049/config.json",
{st_mode=S_IFREG|0644, st_size=2034, ...}) = 0
stat("/var/lib/docker/containers/113bb058e2c31e009230cb7b381182384794f8115b6e1ce9a9dc5a06ac6f63c9/config.json",
{st_mode=S_IFREG|0644, st_size=2033, ...}) = 0
stat("/var/lib/docker/containers/d2e561fba8c9fd3ce0234dba1cf97c6af883656217ebb0c297669cffec19c2cd/config.json",
{st_mode=S_IFREG|0644, st_size=2041, ...}) = 0
[.... repeated for each container, dozens or hundreds of times ...]
sendto(6, "\2\220\0\0\30\0\0\0\221\317\377\377\1\0\0\0\0\0\0\0\0\0\0\0", 24, 0,
NULL, 0) = 24
select(9, [0 3 6 7 8], NULL, NULL, NULL^CProcess 30267 detached
for a single query from pmval (or pminfo), even when no container has started
or exited in the meantime.
This fails to scale in two ways:
- with many containers running, the stat(2) calls alone start consuming serious
time even for a single client
- with many clients running, the effect multiplies: pmdaroot becomes a point of
contention, adding latency to every client
The effect is visible with something like pmlogconf (dozens of short-lived
PCP clients) run against each of a set of containers: every client fetch
rescans every container, so CPU and elapsed time grow roughly with the product
of the client count and the container count.
One query per second against the pmcd.hostname metric of each of 50 containers
is enough to consume more than 10% system CPU in pmdaroot alone.
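To put rough numbers on the figure above, here is a back-of-the-envelope model.
It assumes each fetch costs the two directory stats plus one config.json stat
per container, as the trace suggests; the function name is hypothetical and the
model ignores the IPC and context-switch overhead entirely.

```python
def stat_calls_per_second(containers, queries_per_second_per_container):
    """Estimate stat() calls per second issued by pmdaroot.

    Assumption: every fetch triggers 2 directory stats (docker, lxc)
    plus one config.json stat per container, as in the strace output.
    """
    stats_per_fetch = 2 + containers
    fetches_per_second = containers * queries_per_second_per_container
    return stats_per_fetch * fetches_per_second


# 50 containers, one query per second against each
print(stat_calls_per_second(50, 1))   # 2600
# ten times as many containers costs roughly a hundred times as much
print(stat_calls_per_second(500, 1))  # 251000
```

The quadratic term comes from the container count appearing twice: once in the
number of fetches and once in the cost of each fetch.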
Worker PMDAs should not need to communicate with pmdaroot again once a
container name has been resolved at connection time.
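One way to picture the desired behaviour is a per-connection cache: resolve the
container name once, then answer later fetches from the stored result. The
following is only a sketch under stated assumptions, not PCP code.
ConnectionContext, resolve_container and the resolver callback are hypothetical
names standing in for a worker PMDA's per-client state and its round trip to
pmdaroot, and the integer result (for example a pid) is also an assumption.

```python
from typing import Callable, Dict, Optional


class ConnectionContext:
    """Hypothetical per-client state held by a worker PMDA."""

    def __init__(self, resolver: Callable[[str], Optional[int]]):
        # resolver stands in for the expensive IPC round trip to pmdaroot;
        # it maps a container name to an identifier (assumed: an int).
        self._resolver = resolver
        self._cache: Dict[str, Optional[int]] = {}

    def resolve_container(self, name: str) -> Optional[int]:
        # Only the first lookup for a given name reaches the resolver.
        if name not in self._cache:
            self._cache[name] = self._resolver(name)
        return self._cache[name]


# Count how often the stand-in for pmdaroot is actually consulted.
calls = []

def fake_pmdaroot(name):
    calls.append(name)
    return 4242

ctx = ConnectionContext(fake_pmdaroot)
for _ in range(1000):
    ctx.resolve_container("web1")
print(len(calls))  # 1
```

A real implementation would also have to drop the cached entry when the
container exits; that invalidation is left out of this sketch.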