The "more expensive" write/send in zerocopy is a known cost of paged
SKBs. This cost may be decreased a bit with some fine tuning, but not
What do we get for this cost?
Basically, the big win is not that the card checksums the packet.
We could get that for free while copying the data from userspace
into the kernel pages during the sendmsg(), using the combined
"copy+checksum" hand-coded assembly routines we already have.
It is in fact the better use of memory. Firstly, we use page
allocations, only single ones. With linear buffers SLAB could
use multiple pages which strain the memory subsystem quite a bit at
times. Secondly, we fill pages with socket data precisely whereas
SLAB can only get as tight packing as any general purpose memory
This, I feel, outweighs the slight performance decrease. And I would
wager a bet that the better usage of memory will result in better
all around performance.
The problem with microscopic tests is that you do not see the world
around the thing being focused on. I feel Andrew/Jamal's test are
very valuable, but lets keep things in perspective when doing cost
Finally, please do some tests on loopback. It is usually a great
way to get "pure software overhead" measurements of our TCP stack.
David S. Miller