| To: | Nathan Scott <nathans@xxxxxxxxxx> |
|---|---|
| Subject: | Re: [pcp] pmlogger_check stuck if host is down |
| From: | Rares Vernica <rvernica@xxxxxxxxx> |
| Date: | Mon, 13 Mar 2017 13:29:15 -0700 |
| Cc: | pcp@xxxxxxxxxxx |
| Delivered-to: | pcp@xxxxxxxxxxx |
| In-reply-to: | <1798358557.43980562.1461899036387.JavaMail.zimbra@xxxxxxxxxx> |
| References: | <CALQ9KxCa75FNi0RY7rfSrQjJh=L33mPQWZpQpgGy2quPE+cimQ@xxxxxxxxxxxxxx> <1798358557.43980562.1461899036387.JavaMail.zimbra@xxxxxxxxxx> |
|
Hi Nathan,
Thanks for your pointers. I looked into this some more.

> ----- Original Message -----
> > [...]
> > If one of the remote hosts is down, pmlogger_check gets stuck on that host
> > and takes about 30 min to move on. I ran pmlogger_check with -VV and the
> > output looks like:
> >
> > [...]
>
> > ps ax | grep pml
>
> (any pmprobe processes running OOC? that grep would have excluded 'em, but
> I wonder if thats where the blockage is)

While pmlogger_check was stuck on the host that is down, I ran ps every 3-4 seconds to check for pmprobe. Here is the output:

# ps ax | grep bb-02
19974 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf -r -c -q -h bb-02 /tmp/pcp.SH9Zh8Psb/pmlogger
26891 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/numa
26903 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/numa
26906 pts/0    S+     0:00 pmprobe -h bb-02 -v origin.numa.routerload

# ps ax | grep bb-02
19974 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf -r -c -q -h bb-02 /tmp/pcp.SH9Zh8Psb/pmlogger
26921 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/numa-summary
26933 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/numa-summary
26936 pts/0    S+     0:00 pmprobe -h bb-02 -v origin.numa.migr.intr.total

# ps ax | grep bb-02
19974 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf -r -c -q -h bb-02 /tmp/pcp.SH9Zh8Psb/pmlogger
26951 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/xbow
26963 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/xbow
26966 pts/0    S+     0:00 pmprobe -h bb-02 -v xbow.nports

# ps ax | grep bb-02
19974 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf -r -c -q -h bb-02 /tmp/pcp.SH9Zh8Psb/pmlogger
26951 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/xbow
26963 pts/0    S+     0:00 /bin/sh /usr/libexec/pcp/bin/pmlogconf-setup -h bb-02 /var/lib/pcp/config/pmlogconf/sgi/xbow
26966 pts/0    S+     0:00 pmprobe -h bb-02 -v xbow.nports

So I can see that pmlogconf is making progress, but very slowly: it takes more than 8 minutes to go through the groups of metrics for just one host. Is there a way to short-circuit it if the host is down and the remote pmcd did not respond after the first metric?

Thanks!
Rares |