Re: [pcp] PCP Network Latency PMDA

To: pcp@xxxxxxxxxxx
Subject: Re: [pcp] PCP Network Latency PMDA
From: William Cohen <wcohen@xxxxxxxxxx>
Date: Tue, 24 Jun 2014 16:35:50 -0400
Delivered-to: pcp@xxxxxxxxxxx
In-reply-to: <53A34A47.3060008@xxxxxxxxxx>
References: <53A34A47.3060008@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
On 06/19/2014 04:38 PM, William Cohen wrote:
> I have been looking at a PMDA that provides information about how long
> it takes for packets to make their way from userspace to the network
> device and from the network device to userspace.  People might be
> interested to know whether their network traffic has too much latency in
> the kernel.  The kernel tracepoints and the perf netdev-times script
> [+] allow the user to determine how long network packets take to make
> their way through the networking stack.  However, the netdev-times
> script isn't appropriate for production systems: it provides too much
> detail (information on every packet), incurs far too much overhead, and
> cannot process significant network traffic in real time.
> 
> The same tracepoints are available to systemtap, and a systemtap script
> could provide more appropriate summary-style information to pcp as a
> pmda with much lower overhead.  The thought is that it would probably
> be sufficient to provide packet send and receive latency metrics for
> each network device.  I have some questions about naming the
> performance metrics.  I am thinking of something like the following:
> 
> network.interface.in.latency instance "devname"
> network.interface.out.latency instance "devname"
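> 
> As a rough pmns sketch (with NETLATENCY standing in for whatever
> domain number the pmda ends up with), the leaf nodes might look
> something like:
> 
> network.interface.in {
>     latency    NETLATENCY:0:0
> }
> network.interface.out {
>     latency    NETLATENCY:0:1
> }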
> 
> The value would be the average latency on the device.  This would be
> similar to the kernel.all.cpu.* metrics in that the latencies would be
> averaged over some window of time.  Would it be better to provide raw
> monotonic sums of the delays and counts of network packets in the pmda
> and have pcp derive the average latency from the raw metric values?
> That would allow pcp to use arbitrary window sizes.  For some time t
> and a delta between measurements, the average would be:
> 
> (latency_sum[t]-latency_sum[t-delta])/(packets[t]-packets[t-delta])
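> 
> If the raw counters were exported, the averaging could be left to
> pcp's derived metrics.  A minimal sketch, assuming hypothetical
> network.interface.out.latency_sum and network.interface.out.packets
> counter metrics from the pmda:
> 
> network.interface.out.avg_latency = delta(network.interface.out.latency_sum) / delta(network.interface.out.packets)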
> 
> The systemtap script could use systemtap PROCFS probes [*] to let pcp
> read that information out whenever it wants (a sketch of such a probe
> follows the example output below).  The output could echo the
> /proc/net/dev format, though there might be more latency fields in it
> to give a finer-grained picture of where a packet spends its time:
> 
>      Inter-|       Receive       |      Transmit
>       face |   packets   latency |    packets   latency
>     wlp3s0:          0        0            0        0
>         lo: 1738527854  1373704   1738527854  1373704
> virbr0-nic:          0        0            0        0
>     virbr0:          0        0            0        0
>        em1:   87683319   105450     11920860    62401
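> 
> A procfs read probe producing that output could be as simple as the
> following sketch (rx_packets and rx_latency standing in for whatever
> per-device aggregates the script maintains):
> 
> global rx_packets, rx_latency
> 
> probe procfs("net_latency").read {
>     buf = "Inter-|   Receive\n face | packets latency\n"
>     # one line per device seen so far
>     foreach (dev in rx_packets)
>         buf .= sprintf("%10s: %d %d\n", dev, rx_packets[dev], rx_latency[dev])
>     $value = buf
> }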
> 
> Any thoughts or comments about this proposed network latency PMDA?
> 
> -Will
> 
> 
> [+] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/tools/perf/scripts/python/netdev-times.py
> [*] https://sourceware.org/systemtap/langref/Probe_points.html#SECTION00057000000000000000
> 


Hi All,

I have been experimenting with the tracepoints for the transmit portion of 
networking.  I have a script that tracks when a syscall queues up something 
for transmission (kernel.trace("net_dev_queue")).  It then tracks when the 
packet is handed to the hardware (kernel.trace("net_dev_xmit")) and finally 
freed (kernel.trace("kfree_skb") and kernel.trace("consume_skb")); a 
simplified sketch of these probes follows the sample output below.  A user 
gets the statistics out of /proc/systemtap/*/net_latency, with the 
expectation that this might be the way pcp extracts data from the script. 
Something like:

$ cat /proc/systemtap/stap_df43d0122ca9ec5271896401487121a6_9305/net_latency
#dev: tx_queue_n tx_queue_avg tx_queue_sum tx_xmit_n tx_xmit_avg tx_xmit_sum tx_free_n tx_free_avg tx_free_sum
em1: 89 3277 291671 89 3284 292340 83 3536 293549
lo: 9236 7 69453 9257 1948 18033760 140 96020 13442850
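
To give a feel for the structure, here is a simplified sketch of the
queue-to-xmit portion of the tracking (names here are illustrative; the
attached script is the real thing):

global q_time        # skb pointer -> time the skb was queued (us)
global xmit_lat      # per-device queue-to-xmit latency statistics

probe kernel.trace("net_dev_queue") {
    q_time[$skb] = gettimeofday_us()
}

probe kernel.trace("net_dev_xmit") {
    t = q_time[$skb]
    if (t) {
        # aggregate, so the procfs probe only needs @count/@avg/@sum
        xmit_lat[kernel_string($dev->name)] <<< gettimeofday_us() - t
        delete q_time[$skb]
    }
}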

The script also currently prints out histograms on exit, for debugging 
purposes and to give a feel for what the distributions of times look like, 
but I don't expect pcp would use that output.

The receive side is a bit more complicated because of the asynchronous 
nature of the interrupts for incoming packets relative to the syscall 
reading the data, but I think I understand enough of it to code something 
similar for receives.

Attached is the current linux_xmit_latency.stp script.  Any comments on it 
would be appreciated.

-Will

Attachment: linux_xmit_latency.stp
Description: Text document
