More measurements

To: netdev@xxxxxxxxxxx
Subject: More measurements
From: Andrew Morton <andrewm@xxxxxxxxxx>
Date: Tue, 30 Jan 2001 01:04:10 +1100
Sender: owner-netdev@xxxxxxxxxxx
Let's compare 3coms and eepro100s.  And the effects of MMIO
versus PIO, and other stuff.

                                                      3c905C    3c905C    3c905C    3c905C    3c905C  eepro100  eepro100  eepro100
                                                         CPU    affine      ints    rx pps    tx pps      MMIO   I/O ops      ints

    2.4.1-pre10+zerocopy, sendfile():                   9.6%                4395      4106      8146     15.3%
    2.4.1-pre10+zerocopy, send():                      24.1%                4449      4163      8196     20.2%
    2.4.1-pre10+zerocopy, receiving:                   18.7%               12332      8156      4189     17.6%

    2.4.1-pre10+zerocopy, sendfile(), no xsum/SG:      16.2%
    2.4.1-pre10+zerocopy, send(), no xsum/SG:          21.5%

    2.4.1-pre10-vanilla, using sendfile():             17.1%     17.9%      5729      5296      8214     16.1%     16.8%
    2.4.1-pre10-vanilla, using send():                 21.1%                4629      4152      8191     20.3%     20.6%      6310
    2.4.1-pre10-vanilla, receiving:                    18.3%               12333      8156      4188     17.1%     18.2%     12335

Lots of interesting things here.

- eepro100 generates more interrupts doing TCP Tx, but not
  TCP Rx.  I assume it doesn't do Tx mitigation?

- Changing eepro100 to use I/O operations instead of MMIO slows
  down this dual 500MHz machine by less than one percent at
  100 Mbps and 12,000 interrupts per second.  Why all the fuss
  about MMIO?

- Binding the 3c905C's interrupt to CPU0 slows things down slightly.
  (This is contrary to other measurements I've previously taken.
   Don't pay any attention to this).

- Without the zerocopy patch, there is a significant increase (25%)
  in the number of Rx packets (acks, presumably) when data is sent
  using sendfile() as opposed to when the same data is sent
  with send().

  Workload: 62 files, average size 350k.
            sendfile() tries to send the entire file in one hit;
            send() breaks it up into 64kbyte chunks.

  When the zerocopy patch is applied, the Rx packet rate during
  sendfile() is the same as the rate during send().

  Why is this?

  If this *alone* were fixed in 2.4.1, I'd expect the performance
  gain to be ~10% of system capacity on this NIC.

- I see a consistent 12-13% slowdown on send() with the zerocopy
  patch.  Can this be fixed?
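
For anyone who wants to check the Rx packet observation against the
table, here is a minimal sketch of the arithmetic.  The figures are the
3c905C rx pps and tx pps columns copied from above; the names RATES,
rx_increase and rx_per_tx are made up for illustration.  On these
numbers the vanilla sendfile() case comes out a little above the rough
25% quoted, and the zerocopy case shows essentially no difference.

```python
# 3c905C packet rates (packets per second), copied from the table above.
RATES = {
    ("vanilla", "sendfile"):  {"rx_pps": 5296, "tx_pps": 8214},
    ("vanilla", "send"):      {"rx_pps": 4152, "tx_pps": 8191},
    ("zerocopy", "sendfile"): {"rx_pps": 4106, "tx_pps": 8146},
    ("zerocopy", "send"):     {"rx_pps": 4163, "tx_pps": 8196},
}


def rx_increase(kernel):
    """Relative change in Rx pps when using sendfile() instead of send()."""
    with_sendfile = RATES[(kernel, "sendfile")]["rx_pps"]
    with_send = RATES[(kernel, "send")]["rx_pps"]
    return with_sendfile / with_send - 1.0


def rx_per_tx(kernel, call):
    """Rx packets (presumably acks) received per Tx packet sent."""
    row = RATES[(kernel, call)]
    return row["rx_pps"] / row["tx_pps"]


if __name__ == "__main__":
    for kernel in ("vanilla", "zerocopy"):
        print(f"{kernel}: Rx pps change, sendfile() vs send(): "
              f"{rx_increase(kernel):+.1%}")
        for call in ("sendfile", "send"):
            print(f"  {call}(): {rx_per_tx(kernel, call):.2f} Rx per Tx packet")
```

The Rx-per-Tx ratio is included because the Tx rates are nearly
identical in all four runs, so the extra Rx packets are not explained
by more data being sent.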

