Re: [pcp] netstat output from proc pmda

To: Martins Innus <minnus@xxxxxxxxxxx>
Subject: Re: [pcp] netstat output from proc pmda
From: Nathan Scott <nathans@xxxxxxxxxx>
Date: Mon, 5 Oct 2015 19:38:35 -0400 (EDT)
Cc: pcp developers <pcp@xxxxxxxxxxx>, Jamie Bainbridge <jbainbri@xxxxxxxxxx>
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <560EB3BD.50100@xxxxxxxxxxx>
References: <560EB3BD.50100@xxxxxxxxxxx>
Reply-to: Nathan Scott <nathans@xxxxxxxxxx>
Thread-topic: netstat output from proc pmda
Hi Martins,

[CC'ing Jamie, who has been hacking on a pcp netstat/nicstat recently]

----- Original Message -----
> Hi,
> 
>      I have a need to get netstat type output from pcp. We have a
> certain class of "runaway" processes that we'd like to monitor. The
> scenario is as follows.  One of the machines gets sluggish, we log in and run:
> 
> [vagrant@centos7 ]$ sudo netstat -auntpe
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address   Foreign Address State  User Inode PID/Program name
> tcp   0      0      127.0.0.1:25    0.0.0.0:*       LISTEN 0    17509 1606/master
> tcp   0      0      0.0.0.0:111     0.0.0.0:*       LISTEN 0    16390 958/rpcbind
> tcp   0      0      0.0.0.0:48368   0.0.0.0:*       LISTEN 29   16969 1205/rpc.statd
> tcp   0      0      0.0.0.0:22      0.0.0.0:*       LISTEN 0    16790 1068/sshd
> .............
> 
> 
> and find that the Recv-Q and Send-Q fields are high for some process.  We
> restart the process and all is well again.  I'd love to be able to set up
> pmie rules to monitor this type of thing and take action.
> 
> It's a larger problem that needs to be fixed, but for now I just need a
> good way to deal with this broken software.  Right now I have rules for
> high cpu load based on process name but this is not always a good
> indicator since the process sometimes can use cpu and be perfectly
> fine.  The only accurate measure of a problem seems to be those Recv-Q
> and Send-Q items.
> 
> The quickest thing to do would be to write a pmdanetstat, but I'm not
> sure if this should live in the proc pmda instead since all the
> information comes from the union of /proc/net/tcp with /proc/<pid>/fd.
> Then you could just look for a high value for some instance of:
> 
> proc.fd.socket.recvq
> 
> and all is well.  But this is a case of needing a multidimensional
> instance, in this case either "pid,fd" or "pid,socketnum", since a
> process can have any number of sockets open.  Any suggestions on a way
> to organize this better if the thought is that it should be part of the
> proc pmda?  Maybe it's not worth it for the relatively small number of
> processes that would have these metrics?  If there were an elegant way to
> do this, there might be other metrics that could be added to the
> "pid,fd" indom.
> 

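For reference, the join of /proc/net/tcp with /proc/<pid>/fd described above
can be sketched roughly as below.  This is only an illustration in Python, not
pmdaproc code: the function names are made up, and the field positions assume
the usual Linux /proc/net/tcp layout (hex "tx_queue:rx_queue" in the fifth
column, socket inode in the tenth).  The fd links are passed in as a dict so
the sketch runs without touching a live /proc.

```python
import re

# /proc/<pid>/fd symlinks to sockets look like "socket:[17509]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def parse_proc_net_tcp(text):
    """Map socket inode -> (recv_q, send_q) from /proc/net/tcp-style text."""
    queues = {}
    for line in text.splitlines()[1:]:      # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue                        # skip blank or truncated lines
        tx_hex, rx_hex = fields[4].split(":")
        inode = int(fields[9])
        queues[inode] = (int(rx_hex, 16), int(tx_hex, 16))
    return queues


def join_with_fds(queues, fd_links):
    """fd_links is {(pid, fd): link target}; keep only fds that are known sockets."""
    result = {}
    for (pid, fd), target in fd_links.items():
        match = SOCKET_LINK.match(target)
        if match and int(match.group(1)) in queues:
            result[(pid, fd)] = queues[int(match.group(1))]
    return result


# Hand-written sample in the /proc/net/tcp layout (values are illustrative).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 17509 1 0000000000000000 100 0 0 10 0
   1: 00000000:0016 00000000:0000 0A 00000010:00000200 00:00000000 00000000     0        0 16790 1 0000000000000000 100 0 0 10 0
"""

queues = parse_proc_net_tcp(SAMPLE)
per_fd = join_with_fds(queues, {(1068, 3): "socket:[16790]",
                                (1068, 1): "/dev/null"})
```

On a real system the fd_links dict would be built by reading the symlinks under
each /proc/<pid>/fd directory, which normally needs root for other users'
processes.
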
I think you're on a good path there - in pmdaproc we already have some
compound instances, like cgroup.blkio.dev, which uses a composite
"cgroup::device" string as the external identifier (with internal
identifiers managed by the pmdaCache routines).  So I'd recommend keeping
it in pmdaproc, where the /proc/[pid]/ iteration code lives.
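
To make the compound-instance idea concrete, here is a small Python sketch of
the pattern: an external name built from two keys, mapped to an internal
identifier that stays stable across refreshes.  It is a toy stand-in for what
an instance cache provides, not the pmdaCache API, and both the class and the
"pid::fd" naming (including the zero padding) are hypothetical.

```python
class InstanceCache:
    """Toy instance cache: stable internal ids for external instance names."""

    def __init__(self):
        self._ids = {}        # external name -> internal id, never forgotten
        self._active = set()  # names seen during the current refresh

    def begin_refresh(self):
        """Mark everything inactive; ids are kept so they can be reused."""
        self._active.clear()

    def store(self, name):
        """Return the id for name, allocating the next free id if it is new."""
        inst = self._ids.setdefault(name, len(self._ids))
        self._active.add(name)
        return inst

    def active(self):
        """Sorted (id, name) pairs for instances seen since begin_refresh()."""
        return sorted((self._ids[name], name) for name in self._active)


def socket_instance_name(pid, fd):
    """Hypothetical external name in the style of "cgroup::device"."""
    return f"{pid:06d}::{fd}"


cache = InstanceCache()
cache.begin_refresh()
first = cache.store(socket_instance_name(1068, 3))
second = cache.store(socket_instance_name(1606, 13))

# Next refresh: pid 1068 has closed its socket, pid 1606 still has one open.
cache.begin_refresh()
again = cache.store(socket_instance_name(1606, 13))
```

The point of keeping the id table separate from the active set is that an
instance which disappears and later returns gets the same internal identifier.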

cheers.

--
Nathan
