Marco Mellia writes:
> > Touching the packet data has a major impact. See eth_type_trans
> > in all profiles.
>
> That's exactly what we removed from the driver code: touching the
> packet limits the reception rate to about 1.1 Mpps, while skipping
> the eth_type_trans check allows us to receive 100% of the packets.
>
> skbs are de/allocated using the standard kernel memory management.
> Still, without touching the packet, we can receive 100% of them.
Right. I recall I tried something similar, but as I only have pktgen
as a sender I could only verify this up to pktgen's TX speed of about
860 kpps on the PIII box I mentioned. This was with UP and one NIC.
> When IP forwarding is considered, we again hit the transmission limit
> (using NAPI and your buffer-recycling patch, as mentioned in the paper
> and in the slides... if no buffer recycling is adopted, performance
> drops a bit).
> So it seemed to us that the major bottleneck is the transmission
> limit.
>
> Again, you can get numbers and more details from
>
> http://www.tlc-networks.polito.it/~mellia/euroTLC.pdf
> http://www.tlc-networks.polito.it/mellia/papers/Euro_qos_ip.pdf
Nice. Seems we are getting close to Click with NAPI and recycling. The
skb recycling patch is outdated, as it adds too much complexity to the
kernel. I have an idea for a much more lightweight variant... If you
feel like hacking, I can outline the idea so you can try it.
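Just to give the flavour, a rough, untested sketch of one possible
lightweight scheme -- a short per-device stash of same-sized buffers
reused before hitting the slab. The skb state reset is glossed over,
and a real version needs more care:

#include <linux/skbuff.h>

#define RECYCLE_DEPTH 64        /* arbitrary; tune per ring size */

/* Per-device (or per-CPU) stash of same-sized RX buffers, so the hot
 * path mostly avoids the slab. Init with skb_queue_head_init(). */
struct recycle_pool {
        struct sk_buff_head list;
        unsigned int buf_size;
};

static struct sk_buff *recycle_alloc(struct recycle_pool *pool)
{
        struct sk_buff *skb = skb_dequeue(&pool->list);

        if (skb)
                return skb;                     /* fast path: reuse */
        return dev_alloc_skb(pool->buf_size);   /* slow path: slab */
}

/* Call instead of dev_kfree_skb() when the driver owns the skb. */
static void recycle_free(struct recycle_pool *pool, struct sk_buff *skb)
{
        if (!skb_cloned(skb) && !skb_shared(skb) &&
            skb_queue_len(&pool->list) < RECYCLE_DEPTH) {
                /* crude state reset; a real version must restore
                 * headroom, clear skb fields, etc. */
                skb->data = skb->head;
                skb->tail = skb->data;
                skb->len  = 0;
                skb_queue_tail(&pool->list, skb);
        } else {
                dev_kfree_skb(skb);
        }
}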
> > OK. Good to know about e1000. Networking is mostly DMA, and the CPU
> > is used for administrating it; that is the challenge.
>
> That's true. There is still the chance that the limit is due to the
> hardware CRC calculation (which must be added to the Ethernet frame
> by the NIC...). But we're quite confident that that is not the limit,
> since the same operation must be performed in the reception path...
OK!
> > You could even try to fill TX as soon as the HW says there are
> > available buffers. This could even be done from the TX interrupt.
>
> Are you suggesting we modify pktgen to be more aggressive?
Well, it could be useful, at least as an experiment. Our lab would be
happy...
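Something along these lines -- a hand-waving sketch only, where the
my_*() helpers are placeholders, not real driver or pktgen entry
points:

#include <linux/interrupt.h>
#include <linux/netdevice.h>

/* Placeholders for driver internals -- not real entry points. */
extern void my_clean_tx_ring(struct net_device *dev);
extern int  my_tx_ring_space(struct net_device *dev);
extern int  my_xmit_next(struct net_device *dev);   /* queue a frame */

/* Refill the TX ring directly from the TX-done interrupt, so the
 * ring never drains while pktgen is between xmit calls. */
static irqreturn_t my_tx_irq(int irq, void *dev_id, struct pt_regs *regs)
{
        struct net_device *dev = dev_id;

        my_clean_tx_ring(dev);          /* reclaim finished descriptors */

        /* Keep stuffing pre-built frames while descriptors are free,
         * instead of waiting for the next softirq/xmit call. */
        while (my_tx_ring_space(dev) > 0)
                if (my_xmit_next(dev) != 0)
                        break;

        return IRQ_HANDLED;
}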
> > Small-packet performance depends on low latency. Higher bus speed
> > gives shorter latency, but even on higher-speed buses there tend to
> > be bridges that add latency.
>
> That's true. We suspect that the limit is due to bus latency. But
> still, we are surprised, since the bus allows us to receive 100% but
> to transmit only ~50%. Moreover, the raw aggregate bandwidth of the
> bus is _far_ larger (133 MHz * 64 bit ~ 8 Gbit/s).
Have a look at the graph in the pktgen paper presented at Linux-Kongress
in Erlangen 2004. It seems like even at 8 Gbit/s this limits small-packet
TX performance.
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
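Back-of-envelope in C, with made-up transaction-overhead numbers just
to show the shape of the problem: the raw figure is pure per-cycle
bandwidth, while every small-packet DMA burst also pays
arbitration/address-phase/bridge latency, and that overhead dominates:

#include <stdio.h>

int main(void)
{
        double bus_hz   = 133e6;                /* PCI-X clock */
        double bus_bits = 64.0;                 /* bus width */
        double raw_bps  = bus_hz * bus_bits;    /* ~8.5 Gbit/s raw */

        /* Assumed per-packet TX transactions: descriptor read, payload
         * read, status write-back. Guess ~50 dead cycles per burst for
         * arbitration, address phase and bridge latency -- invented
         * numbers, not measurements. */
        int    bursts          = 3;
        double overhead_cycles = 50.0 * bursts;
        double data_bytes      = 60 + 16 + 8;   /* frame + desc + status */
        double data_cycles     = data_bytes / 8.0; /* 8 bytes/cycle */
        double pps = bus_hz / (data_cycles + overhead_cycles);

        printf("raw bus:          %.1f Gbit/s\n", raw_bps / 1e9);
        printf("est. 64B TX rate: %.2f Mpps\n", pps / 1e6);
        return 0;
}

With these guesses the data phases are ~10 cycles against ~150 cycles
of overhead per packet, which lands in the sub-Mpps range we actually
see for small-packet TX despite the ~8.5 Gbit/s raw bus.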
--ro