
netstat output from proc pmda

To: pcp developers <pcp@xxxxxxxxxxx>
Subject: netstat output from proc pmda
From: Martins Innus <minnus@xxxxxxxxxxx>
Date: Fri, 2 Oct 2015 12:41:33 -0400
Delivered-to: pcp@xxxxxxxxxxx
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
Hi,

I need to get netstat-type output from pcp. We have a certain class of "runaway" processes that we'd like to monitor. The scenario is as follows: one of the machines gets sluggish, so we log in and run:

[vagrant@centos7 ]$ sudo netstat -auntpe
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address   Foreign Address State  User Inode PID/Program name
tcp        0      0 127.0.0.1:25    0.0.0.0:*       LISTEN 0    17509 1606/master
tcp        0      0 0.0.0.0:111     0.0.0.0:*       LISTEN 0    16390 958/rpcbind
tcp        0      0 0.0.0.0:48368   0.0.0.0:*       LISTEN 29   16969 1205/rpc.statd
tcp        0      0 0.0.0.0:22      0.0.0.0:*       LISTEN 0    16790 1068/sshd
.............


and find that the Recv-Q and Send-Q fields are high for some process. We restart the process and all is well again. I'd love to be able to set up pmie rules to monitor this type of thing and take action.

It's a larger problem that needs to be fixed, but for now I just need a good way to deal with this broken software. Right now I have rules for high CPU load based on process name, but this is not always a good indicator, since the process can sometimes use CPU and be perfectly fine. The only accurate measure of a problem seems to be the Recv-Q and Send-Q values.

The quickest thing to do would be to write a pmdanetstat, but I'm not sure if this should live in the proc pmda instead since all the information comes from the union of /proc/net/tcp with /proc/<pid>/fd. Then you could just look for a high value for some instance of:

proc.fd.socket.recvq

and all is well. But this is a case of needing a multidimensional instance, in this case either "pid,fd" or "pid,socketnum", since a process can have any number of sockets open. Any suggestions on a way to organize this better, if the thought is that it should be part of the proc pmda? Maybe it's not worth it for the relatively small number of processes that would have these metrics? If there were an elegant way to do this, there might be other metrics that could be added to the "pid,fd" indom.
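
To make the "pid,fd" idea concrete, here is a rough Python sketch of how such composite instance names might be built. It is only an illustration, not how the proc pmda works: the function names, the "pid,fd" string format, and the queues_by_inode argument (assumed to map a socket inode to a (recv_q, send_q) pair taken from /proc/net/tcp) are all hypothetical.

```python
import os
import re

# /proc/<pid>/fd entries that are sockets are symlinks of the form "socket:[<inode>]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the socket inode encoded in an fd symlink target, or None."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def pid_fd_instances(fd_links, queues_by_inode):
    """Build hypothetical "pid,fd" instance names for socket fds.

    fd_links: iterable of (pid, fd, link_target) tuples.
    queues_by_inode: dict mapping socket inode -> (recv_q, send_q).
    Returns a dict mapping "pid,fd" -> (recv_q, send_q).
    """
    instances = {}
    for pid, fd, target in fd_links:
        inode = socket_inode(target)
        if inode in queues_by_inode:
            instances["%d,%d" % (pid, fd)] = queues_by_inode[inode]
    return instances


def scan_fd_links(proc_root="/proc"):
    """Yield (pid, fd, link_target) for every readable /proc/<pid>/fd entry."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed between listdir and readlink
            yield int(entry), int(fd), target
```

On a Linux host, pid_fd_instances(scan_fd_links(), queues_by_inode) would give one entry per socket fd of each readable process. Reading other users' fd directories generally needs root, which matches the sudo in the netstat invocation above.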

Otherwise, I think it would be very straightforward to do as a new pmdanetstat where the instances are the inodes, since those should be unique. And I may just do that locally for now to get something up and running quickly.
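
As a rough sketch of the inode-keyed variant, the following Python parses text in the /proc/net/tcp layout into a map from socket inode to queue sizes. The function names and the threshold helper are made up for illustration. It assumes the usual column layout, where the fifth whitespace-separated field is "tx_queue:rx_queue" in hex and the tenth is the inode, and it assumes an inode of 0 means there is no socket inode to report (as seen for TIME_WAIT entries).

```python
def parse_proc_net_tcp(text):
    """Map socket inode -> (recv_q, send_q) from /proc/net/tcp-style text."""
    queues = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue  # blank or truncated line
        tx_hex, rx_hex = fields[4].split(":")
        inode = int(fields[9])
        if inode == 0:
            continue  # assumed: no socket inode to key the instance on
        queues[inode] = (int(rx_hex, 16), int(tx_hex, 16))
    return queues


def busy_inodes(queues, threshold):
    """Return the sorted inodes whose Recv-Q or Send-Q exceeds threshold."""
    return sorted(
        inode
        for inode, (recv_q, send_q) in queues.items()
        if recv_q > threshold or send_q > threshold
    )
```

On a Linux host this could be fed with the contents of /proc/net/tcp (and /proc/net/tcp6, /proc/net/udp for the other protocols netstat shows); the inodes returned by busy_inodes would then be the instances a pmie rule would fire on.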

Thanks

Martins
