Hello all,
In testing TCP throughput with a pair of National Semiconductor-based gige
cards (using zerocopy send and receive), the kernel seems to be performing
far too many context switches in the receiver, leading to only a 75MB/s
transfer rate. net.ipv4.tcp_[rw]mem on both machines is set to ~32MB
right now. The transmit program inner loop is basically:
while ((i = sendfile(... 16KB)) > 0 && (sent += i) < 2000000000);
which works out quite well. The receiver code looks like:
while (read(s, buf, 2048*1024) > 0);
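(For completeness, here's a standalone version of the two loops above; the
port number, argument handling, and setup code are filled in from scratch,
so treat them as illustrative rather than the actual test code. Socket
buffers are deliberately left alone so the tcp_[rw]mem settings above
apply.)

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PORT  5001            /* arbitrary test port */
#define CHUNK (16 * 1024)     /* 16KB per sendfile() call */
#define TOTAL 2000000000LL    /* ~2GB per run */
#define RBUF  (2048 * 1024)   /* 2MB read() buffer */

int main(int argc, char **argv)
{
        struct sockaddr_in sa;

        memset(&sa, 0, sizeof(sa));
        sa.sin_family = AF_INET;
        sa.sin_port = htons(PORT);

        if (argc >= 2 && !strcmp(argv[1], "recv")) {
                static char buf[RBUF];
                int one = 1, l = socket(AF_INET, SOCK_STREAM, 0), s;

                setsockopt(l, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
                sa.sin_addr.s_addr = htonl(INADDR_ANY);
                if (bind(l, (struct sockaddr *)&sa, sizeof(sa)) ||
                    listen(l, 1))
                        return perror("listen"), 1;
                s = accept(l, NULL, NULL);

                /* receiver inner loop: drain the socket until EOF */
                while (read(s, buf, sizeof(buf)) > 0)
                        ;
        } else if (argc >= 4 && !strcmp(argv[1], "send")) {
                int s = socket(AF_INET, SOCK_STREAM, 0);
                int fd = open(argv[3], O_RDONLY);
                off_t off = 0;
                long long sent = 0;
                ssize_t i;

                inet_pton(AF_INET, argv[2], &sa.sin_addr);
                if (fd < 0 || connect(s, (struct sockaddr *)&sa, sizeof(sa)))
                        return perror("connect"), 1;

                /* sender inner loop: zerocopy 16KB chunks until ~2GB out */
                while ((i = sendfile(s, fd, &off, CHUNK)) > 0 &&
                       (sent += i) < TOTAL)
                        ;
        } else {
                fprintf(stderr, "usage: %s recv | send <host> <file>\n",
                        argv[0]);
                return 1;
        }
        return 0;
}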
I tried various buffer sizes from 4K to 32MB, but the impact on
performance was negligible. What stays constant is that the receiver does
about 2 context switches per interrupt -- odd. Here's some vmstat output
during a run:
sender (columns: r b w swpd free buff cache si so bi bo in cs us sy id)
1 0 0 0 3831264 134320 10364 0 0 0 0 2575 26 1 24 75
0 0 0 0 3831264 134320 10364 0 0 0 0 2554 28 0 35 65
0 0 0 0 3831264 134320 10364 0 0 0 0 2552 27 0 73 27
0 0 0 0 3831264 134320 10364 0 0 0 8 2545 29 0 17 83
receiver
1 0 0 0 84660 1372 12628 0 0 0 0 2629 5015 0 85 15
1 0 0 0 84660 1372 12628 0 0 0 0 2626 5015 2 54 44
1 0 0 0 84660 1372 12628 0 0 0 0 2627 5015 0 96 4
1 0 0 0 84660 1372 12628 0 0 0 0 2625 5010 0 87 13
vmstat shows that the transmitter only wakes up a couple of dozen times
per second -- about what's expected given the size of the TCP window. The
receiver is another story entirely, doing ~5000 context switches per
second against ~2600 interrupts. Does anyone have any idea as to what
might be going on? This is with 2.4.9-ac2, but 2.4.8-ac6 shows the same
behaviour. One of the 2.4.3 kernels I tried (ia64) seems to be much
quicker.
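In case anyone wants to watch the ratio directly instead of eyeballing
vmstat, here's a quick hack (mine, not part of any standard tool) that
samples the running intr/ctxt totals from /proc/stat once a second and
prints the per-second deltas plus context switches per interrupt:

#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* grab the running totals from the "intr" and "ctxt" lines */
static void sample(unsigned long long *intr, unsigned long long *ctxt)
{
        char line[4096];
        FILE *f = fopen("/proc/stat", "r");

        while (f && fgets(line, sizeof(line), f)) {
                if (!strncmp(line, "intr ", 5))
                        sscanf(line + 5, "%llu", intr);
                else if (!strncmp(line, "ctxt ", 5))
                        sscanf(line + 5, "%llu", ctxt);
        }
        if (f)
                fclose(f);
}

int main(void)
{
        unsigned long long i0 = 0, c0 = 0, i1, c1;

        sample(&i0, &c0);
        for (;;) {
                sleep(1);
                i1 = c1 = 0;
                sample(&i1, &c1);
                printf("intr/s %llu  ctxt/s %llu  ctxt/intr %.2f\n",
                       i1 - i0, c1 - c0,
                       i1 > i0 ? (double)(c1 - c0) / (i1 - i0) : 0.0);
                i0 = i1;
                c0 = c1;
        }
        return 0;
}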
-ben