The following patch set adds hardware offload of the crypto operations for
IPv4 IPSEC processing. It gives a noticeable speedup on my (admittedly older)
hardware and, given the recent numbers posted, may be a speedup even on
more recent hardware.

This version incorporates a few comments from the mailing lists, and removes
GFP_ATOMIC allocations. There are a few known issues with the current
patchset, but I think it is ready for wider review:

* Only the 3Com 3CR990 family of NICs is supported. I don't have hardware
  or documentation for the Intel cards.
* Doesn't do IPv6. Need someone to implement map_direction() and
  AH/ESP handling, as well as come up with a card that supports it.
* linux/skbuff.h cannot include net/xfrm.h currently, so there are
  redundant defines (this requires some header cleanup, which I'm not
  very inclined to tackle at the moment).
* TCP Segmentation offload seems broken by firmware 03.001.008. It could be
  my changes to support the offload, but that seems unlikely. I will
  have to investigate this.
* Latency suffers somewhat on smaller packets; it may be advisable to have
  a minimum packet size to offload.
* No real feedback on which xfrm_states have been offloaded.
* No individual control over which xfrm_states are allowed to be offloaded.

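The minimum-packet-size idea from the latency item above could look roughly
like the following. This is only a sketch and not part of the patch set:
OFFLOAD_MIN_LEN and want_crypto_offload() are hypothetical names, and the
256 byte cutoff is a placeholder rather than a measured value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff; a real value would have to come from measurements. */
#define OFFLOAD_MIN_LEN 256

/*
 * Decide whether a packet should be handed to the NIC for crypto.
 * Returns false for short packets (they stay on the software path)
 * and for devices that cannot offload at all.
 */
static bool want_crypto_offload(size_t payload_len, bool dev_can_offload)
{
	if (!dev_can_offload)
		return false;
	return payload_len >= OFFLOAD_MIN_LEN;
}
```

With this placeholder cutoff, a 1 byte message would stay in software while
a 1024 byte message would be offloaded. Whether the cutoff should be fixed
or tunable is an open question.
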
The patch set will be sent as follow-ups to this post, or is available via:

	bk pull http://typhoon.bkbits.net/ipsec-2.6

If you have pulled from there before, you will need to reclone, as the
repository has been recreated.

It will update the following files:

 Documentation/networking/netdevices.txt |   16
 drivers/net/typhoon.c                   |  720 +++++++++++++++++++++++++++++++-
 drivers/net/typhoon.h                   |   38 +
 include/linux/ethtool.h                 |    8
 include/linux/netdevice.h               |   10
 include/linux/skbuff.h                  |   55 ++
 include/net/dst.h                       |    2
 include/net/xfrm.h                      |  102 ++++
 net/core/ethtool.c                      |   54 ++
 net/core/skbuff.c                       |   31 +
 net/ipv4/ah4.c                          |   99 ++--
 net/ipv4/esp4.c                         |  102 ++--
 net/ipv4/xfrm4_state.c                  |    9
 net/ipv6/xfrm6_state.c                  |   10
 net/xfrm/xfrm_export.c                  |    4
 net/xfrm/xfrm_policy.c                  |   85 +++
 net/xfrm/xfrm_state.c                   |   99 ++++
 17 files changed, 1329 insertions(+), 115 deletions(-)

If you work from the mailed patches, you will want the netdev-2.6 updates
to the typhoon driver, as the 3CR990B series needs the newest firmware to
correctly offload IPSEC processing. That patch is available from

The following results were generated using a dual processor PIII 1GHz/512MB
with a 3CR990SVR97 (ori) and an Athlon 550 MHz/256MB with a 3CR990B (tank).
Latency testing was performed with lmbench's lat_tcp, and bandwidth testing
was performed with Andrew Morton's zcc/zcs/cyclesoak. I ran the tests
multiple times, and picked the median results to report. There was not much
deviation in the results (+/- 1.5 us, +/- 50 KBytes/s, +/- 1.5% CPU usage).

TCP Latency tests (1 byte msg)
  No IPSEC                       196 us
  AH/SHA1 (sw)                   256 us
  AH/SHA1 (hw)                   317 us
  ESP/3DES,SHA1 (sw)             333 us
  ESP/3DES,SHA1 (hw)             347 us
  ESP-AH/3DES,SHA1-SHA1 (sw)     387 us
  ESP-AH/3DES,SHA1-SHA1 (hw)     467 us

TCP Latency tests (1024 byte msg)
  No IPSEC                       625 us
  AH/SHA1 (sw)                   771 us
  AH/SHA1 (hw)                   858 us
  ESP/3DES,SHA1 (sw)            1999 us
  ESP/3DES,SHA1 (hw)             902 us
  ESP-AH/3DES,SHA1-SHA1 (sw)    2140 us
  ESP-AH/3DES,SHA1-SHA1 (hw)    1131 us

Config (sender -> receiver)               Bandwidth  ori CPU  tank CPU
No IPSEC (tank->ori)                     11494 KB/s    11.9%     18.7%
No IPSEC (ori->tank)                     11492 KB/s     9.5%     34.3%
AH/SHA1 (sw) (tank->ori)                 11303 KB/s    29.2%     79.3%
AH/SHA1 (sw) (ori->tank)                 11302 KB/s    28.6%     91.1%
ESP/3DES,SHA1 (sw) (tank->ori)            2130 KB/s    29.6%      100%
ESP/3DES,SHA1 (sw) (ori->tank)            2263 KB/s    29.3%     99.7%
ESP-AH/3DES,SHA1-SHA1 (sw) (tank->ori)    1906 KB/s    29.1%      100%
ESP-AH/3DES,SHA1-SHA1 (sw) (ori->tank)    2051 KB/s    29.3%     99.7%
AH/SHA1 (hw) (tank->ori)                 11303 KB/s    14.0%     30.2%
AH/SHA1 (hw) (ori->tank)                 11301 KB/s    14.1%     39.8%
ESP/3DES,SHA1 (hw) (tank->ori)           11221 KB/s    15.4%     44.9%
ESP/3DES,SHA1 (hw) (ori->tank)           11220 KB/s    21.5%     48.1%
ESP-AH/3DES,SHA1-SHA1 (hw) (tank->ori)    5920 KB/s    10.8%     35.9%
ESP-AH/3DES,SHA1-SHA1 (hw) (ori->tank)    7189 KB/s    14.3%     35.4%

The last line seems suspicious, and should probably be retested.
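
To put the numbers above in perspective, the sw/hw ratios can be worked out
directly from the reported figures. The helpers below are only an
illustrative sketch; latency_speedup() and bandwidth_gain() are names made
up for this example, not anything from the patches or the benchmark tools.

```c
#include <assert.h>

/* Latency: software time divided by hardware time; > 1.0 means hw is faster. */
static double latency_speedup(double sw_us, double hw_us)
{
	assert(hw_us > 0.0);
	return sw_us / hw_us;
}

/* Bandwidth: hardware rate divided by software rate; > 1.0 means hw is faster. */
static double bandwidth_gain(double sw_kbs, double hw_kbs)
{
	assert(sw_kbs > 0.0);
	return hw_kbs / sw_kbs;
}
```

Using the reported medians, latency_speedup(1999, 902) for ESP/3DES,SHA1 with
1024 byte messages is roughly 2.2, while latency_speedup(256, 317) for AH/SHA1
with 1 byte messages is below 1.0, i.e. offload is slower there. For
ESP/3DES,SHA1 bandwidth (tank->ori), bandwidth_gain(2130, 11221) is roughly 5.3.
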