"David S. Miller" <davem@xxxxxxxxxxxxx> writes:
> This shows up in testing where the connection is application limited.
> For example, an "scp" goes more slowly over TSO now, because there are
> fewer CPU cycles available for the encryption.
> It's tricky to come up with a scheme to fix this. I would love to be
> able to not do the page grabs/releases in the actual TSO frame. I
> really haven't come up with a clean way to do that however.
Are you sure a few atomic_inc/dec are really causing a noticeable
slowdown? That would surprise me unless you have lots of cache line
bouncing on an MP system.
What CPU did you test it on? Does it happen with only a single CPU?
And did you actually see them in some profile?
Assuming the struct page is in cache, the P4 core is the slowest one I
know of at these atomic operations, but even on that one the cost should
be in the noise compared to the other overhead of talking to a NIC on a
PCI bus.
Perhaps it is something else.