Marco Mellia writes:
> > Touching the packet data has a major impact. See eth_type_trans
> > in all profiles.
> That's exactly what we removed from the driver code: touching the packet
> limits the reception rate to about 1.1 Mpps, while skipping the
> eth_type_trans check actually allows us to receive 100% of packets.
> skbs are allocated and freed using standard kernel memory management. Still,
> without touching the packet, we can receive 100% of them.
Right. I recall I tried something similar, but as I only have pktgen
as sender I could only verify this up to the pktgen TX speed, about 860 kpps
for the PIII box I mentioned. This was with UP and one NIC.
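
To illustrate what "touching the packet" means here: classifying a frame
requires reading its header, which the NIC has just written to memory by DMA,
so the first access typically misses the CPU cache. Below is a simplified
userspace sketch of that kind of lookup. It is not the kernel's
eth_type_trans, which does more work; frame_ethertype is a hypothetical
name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14u /* 6 bytes dst MAC + 6 bytes src MAC + 2 bytes EtherType */

/*
 * Return the EtherType of a raw Ethernet frame in host byte order,
 * or 0 if the buffer is too short to hold an Ethernet header.
 * Reading frame[12] and frame[13] is the "touch" of packet data:
 * it forces the header's cache line to be fetched from memory.
 */
uint16_t frame_ethertype(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < ETH_HDR_LEN)
        return 0;
    return (uint16_t)((frame[12] << 8) | frame[13]);
}
```

For an IPv4 frame this returns 0x0800. A driver that skips this step never
loads the header into the cache, which is consistent with the higher
receive rate reported above.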
> When IP forwarding is considered, we once more hit the transmission limit
> (using NAPI and your buffer recycling patch, as mentioned in the paper
> and on the slides... If no buffer recycling is adopted, performance drops
> a bit).
> So it seemed to us that the major bottleneck is due to the transmission.
> Again, you can get numbers and more details from
Nice. Seems we are getting close to Click with NAPI and recycling. The skb
recycling patch is outdated, as it adds too much complexity to the kernel. I
have an idea for how to make a much more lightweight variant... If you feel
like hacking, I can outline the idea so you can try it.
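
For readers unfamiliar with the term, buffer recycling in general means
handing a freed buffer straight back to the receive path instead of returning
it to the allocator. The sketch below is only a generic userspace free list
showing that principle. It is neither the existing recycling patch nor the
lightweight variant mentioned above, and buf_pool, pool_get, pool_put,
POOL_MAX and BUF_SIZE are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_MAX 64   /* assumed upper bound on cached buffers */
#define BUF_SIZE 2048 /* assumed fixed buffer size */

struct buf_pool {
    void  *free[POOL_MAX]; /* stack of recycled buffers */
    size_t n;              /* number of buffers currently cached */
};

/* Take a buffer: reuse a recycled one if available, else allocate. */
void *pool_get(struct buf_pool *p)
{
    assert(p != NULL);
    if (p->n > 0)
        return p->free[--p->n];
    return malloc(BUF_SIZE);
}

/* Give a buffer back: cache it if there is room, else really free it. */
void pool_put(struct buf_pool *p, void *buf)
{
    assert(p != NULL);
    if (buf == NULL)
        return;
    if (p->n < POOL_MAX)
        p->free[p->n++] = buf;
    else
        free(buf);
}
```

The point of the pattern is that the common case (get after put) costs a
couple of pointer operations and never enters the general allocator.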
> > OK. Good to know about e1000. Networking is mostly DMA, and the CPU is
> > used for administrating it; this is the challenge.
> That's true. There is still the chance that the limit is due to hardware
> CRC calculation (which must be added to the Ethernet frame by the
> NIC...). But we're quite comfortable that that is not the limit, since
> in the reception path the same operation must be performed...
> > You could even try to fill TX as soon as the HW says there are buffers
> > available. This could even be done from the TX interrupt.
> Are you suggesting that we modify pktgen to be more aggressive?
Well, it could be useful at least as an experiment. Our lab would be
> > Small-packet performance depends on low latency. Higher bus speed
> > gives shorter latency, but higher-speed buses also tend to have
> > bridges that add latency.
> That's true. We suspect that the limit is due to bus latency. But still,
> we are surprised, since the bus allows us to receive 100% but to transmit
> only up to ~50%. Moreover, the raw aggregate bandwidth of the bus is _far_
> larger (133 MHz * 64 bit ~ 8 Gbit/s).
Have a look at the graph in the pktgen paper presented at Linux-Kongress in
Erlangen 2004. It seems that even at 8 Gbit/s this is limiting small-packet
TX performance.
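
As a back-of-the-envelope check on the figures in this thread, here is a
small sketch of the arithmetic. It assumes gigabit Ethernet with minimum-size
64-byte frames, plus the standard 20 bytes of preamble and inter-frame gap
per frame on the wire. The function names are hypothetical and chosen only
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame wire overhead: 8 bytes preamble/SFD + 12 bytes inter-frame gap. */
#define ETH_WIRE_OVERHEAD 20u

/*
 * Maximum frames per second on a link of link_bps bit/s for frames of
 * frame_bytes bytes (frame size includes the 4-byte CRC).
 */
uint64_t max_pps(uint64_t link_bps, unsigned frame_bytes)
{
    assert(frame_bytes >= 64); /* minimum Ethernet frame size */
    return link_bps / ((uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD) * 8u);
}

/* Raw bus bandwidth in bit/s: clock rate (Hz) times bus width (bits). */
uint64_t bus_bps(uint64_t clock_hz, unsigned width_bits)
{
    return clock_hz * width_bits;
}
```

Under these assumptions max_pps(1000000000, 64) is about 1.488 Mpps, so
860 kpps is roughly 58% of wire rate, and bus_bps(133000000, 64) is about
8.5 Gbit/s. The raw bus bandwidth is therefore several times what minimum-size
frames at wire rate need, which points at per-transaction latency rather than
bandwidth.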