On Tue, 7 Jun 2005, David S. Miller wrote:
> There also seems to be some misconceptions about changing the weight
> value. It actually improves the performance of other drivers as well.
> Not as much as it improves the e1000 performance but it does seem to
> help others as well.
One reason it helps e1000 more, which Robert Olsson mentioned, could
be the HW irq mitigation settings used by the e1000 driver. Lowering
these would be a good test.
Well, first a little more data. The machine in question is a dual xeon
running 2.6.12-rc5 or 2.6.12-rc4-supertso
with the 2.6.12-rc5 kernel (the old) tso promptly shuts down after a SACK,
and after that point the machine is CPU bound at 100%. This is the point
that we start to drop packets at the hardware level.
I tried the experiment today where I replenish buffers to hardware every
16 packets or so. This appears to mitigate all drops at the hardware
level (no drops). We're still at 100% with the rc5 kernel, however.
even with this replenish fix, the addition of dropping the weight to 16
helped increase our throughput, although only about 1%.
On the other hand, taking our driver as is with no changes and running the
supertso (not the split out version, yet) kernel, we show no dropped
packets and 60% cpu use. This combines with a 6% increase in throughput,
and the data pattern on the wire is much more constant (i have tcpdumps,
do you want to see them Dave?)
I'm looking forward to trying the split out patches tomorrow.
here is my (compile tested) patch, for e1000
diff -rup e1000-6.0.60.orig/src/e1000_main.c e1000-6.0.60/src/e1000_main.c
--- e1000-6.0.60.orig/src/e1000_main.c 2005-06-07 19:07:37.000000000 -0700
+++ e1000-6.0.60/src/e1000_main.c 2005-06-07 19:15:05.000000000 -0700
@@ -3074,11 +3074,14 @@ e1000_clean_rx_irq(struct e1000_adapter
next_desc:
rx_desc->status = 0;
buffer_info->skb = NULL;
+ if(unlikely((i & ~(E1000_RX_BUFFER_WRITE - 1)) == i))
+ adapter->alloc_rx_buf(adapter);
if(unlikely(++i == rx_ring->count)) i = 0;
rx_desc = E1000_RX_DESC(*rx_ring, i);
}
rx_ring->next_to_clean = i;
+ /* not sure this is necessary any more, but its safe */
adapter->alloc_rx_buf(adapter);
return cleaned;
@@ -3209,12 +3212,15 @@ e1000_clean_rx_irq_ps(struct e1000_adapt
next_desc:
rx_desc->wb.middle.status_error &= ~0xFF;
buffer_info->skb = NULL;
+ if(unlikely((i & ~(E1000_RX_BUFFER_WRITE - 1)) == i))
+ adapter->alloc_rx_buf(adapter);
if(unlikely(++i == rx_ring->count)) i = 0;
rx_desc = E1000_RX_DESC_PS(*rx_ring, i);
staterr = le32_to_cpu(rx_desc->wb.middle.status_error);
}
rx_ring->next_to_clean = i;
+ /* not sure this is necessary any more, but its safe */
adapter->alloc_rx_buf(adapter);
return cleaned;
PS e1000-6.0.60 is posted on sf.net/projects/e1000 now.
|