pcp

Re: [pcp] pcp updates: containers, qa

To: Nathan Scott <nathans@xxxxxxxxxx>
Subject: Re: [pcp] pcp updates: containers, qa
From: Mark Goodwin <mgoodwin@xxxxxxxxxx>
Date: Thu, 14 May 2015 21:49:20 +1000
Cc: PCP <pcp@xxxxxxxxxxx>
On 05/14/2015 04:01 PM, Nathan Scott wrote:
Hi Mark,

----- Original Message -----
On 05/13/2015 03:46 PM, Nathan Scott wrote:

Sounds good - thanks mate.  If you don't get to it tonight, toss it back to
me and I'll write the test tomorrow.

Here's the flat profile of pmdaroot compiled with -pg after running
pminfo -f containers.state.running on 14662 containers. Basically
pmdaroot is disk bound, reading and parsing the JSON for each container
below /var/lib/docker/containers (~467 MB in total). With default
parameters, the PMDA times out. After fixing that, the client times out.
Once that is fixed too, you wait ~22 seconds or so ... which isn't
actually too bad given how much JSON data has to be parsed.

Can you try things with the attached patch?

It didn't make any difference. My system now has 36000+ containers (I'm
determined to see what happens when docker creates the 65537th
container, thus running out of class B/C addresses, so I'm leaving my
script running).

After restarting pmcd -t 0 and with PMCD_REQUEST_TIMEOUT disabled,
the first fetch of containers.state.running takes over 18 minutes
(reading and JSON-parsing every container). The second and subsequent
fetches take ~2s (just a stat() per container, since none of them are
running).
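
To illustrate why the later fetches are so much cheaper, here is a minimal
sketch of a stat()-based cache check. It assumes the cached state is keyed
on the file's modification time; that is a guess, not necessarily what
pmdaroot compares. The struct container_cache type and the needs_reparse()
helper are hypothetical names for this example only.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical per-container cache entry (not a pmdaroot structure). */
struct container_cache {
    time_t mtime;   /* modification time seen at the last parse */
    bool   valid;   /* true once the entry has been populated */
};

/*
 * Returns 1 if the file at path must be (re)parsed, 0 if the cached
 * state is still current, and -1 if stat() fails (e.g. file removed).
 * Assumption: an unchanged st_mtime means unchanged contents.
 */
int needs_reparse(const char *path, struct container_cache *c)
{
    struct stat sb;

    if (stat(path, &sb) != 0) {
        c->valid = false;       /* forget stale state */
        return -1;
    }
    if (c->valid && c->mtime == sb.st_mtime)
        return 0;               /* cheap path: one stat(), no parsing */

    c->mtime = sb.st_mtime;     /* remember what we are about to parse */
    c->valid = true;
    return 1;
}
```

With a check like this, only the first pass pays for reading and parsing;
later passes cost one stat() per container unless a file has changed.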

So we need a second thread for the initial scan, during which the
main thread just returns PM_ERR_PMDANOTREADY for all requests. In
reality, this many containers is a bit silly, so the patch for this
is on my back-burner - let's move on to more important container work!
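
A rough sketch of the two-thread idea, using plain pthreads and C11
atomics rather than the real PMDA callbacks. SKETCH_ERR_NOTREADY is a
stand-in for PM_ERR_PMDANOTREADY (its value here is arbitrary), and
scan_all_containers(), start_initial_scan(), fetch_running() and
wait_for_scan() are hypothetical names for this example, not pmdaroot
functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Stand-in for PM_ERR_PMDANOTREADY; the real value lives in <pcp/pmapi.h>. */
#define SKETCH_ERR_NOTREADY (-1)

static atomic_int scan_done;     /* 0 until the initial scan has finished */
static int running_containers;   /* written by the scan thread only */
static pthread_t scan_thread;

/* Hypothetical placeholder for the slow read-and-parse of every container. */
static int scan_all_containers(void)
{
    return 0;                    /* none running, as in the test setup */
}

/* Background thread: do the expensive scan, then publish the result. */
static void *initial_scan(void *arg)
{
    (void)arg;
    running_containers = scan_all_containers();
    atomic_store(&scan_done, 1); /* makes the result visible to readers */
    return NULL;
}

/* Called once at startup; returns 0 on success, like pthread_create(). */
int start_initial_scan(void)
{
    return pthread_create(&scan_thread, NULL, initial_scan, NULL);
}

/* Main-thread fetch path: refuse to answer until the scan is complete. */
int fetch_running(int *value)
{
    if (!atomic_load(&scan_done))
        return SKETCH_ERR_NOTREADY;
    *value = running_containers;
    return 0;
}

/* Only needed for orderly shutdown (or for testing the sketch). */
void wait_for_scan(void)
{
    int rc = pthread_join(scan_thread, NULL);
    assert(rc == 0);
    (void)rc;
}
```

The main thread never blocks on the scan, so neither the PMDA timeout nor
the client timeout is hit; callers just see the not-ready error until the
flag flips.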

Regards
