netdev
[Top] [All Lists]

RE: The ultimate TOE design

To: Leonid Grossman <leonid.grossman@xxxxxxxx>
Subject: RE: The ultimate TOE design
From: jamal <hadi@xxxxxxxxxx>
Date: 16 Sep 2004 05:57:54 -0400
Cc: "'Jeff Garzik'" <jgarzik@xxxxxxxxx>, "'David S. Miller'" <davem@xxxxxxxxxxxxx>, alan@xxxxxxxxxxxxxxxxxxx, paul@xxxxxxxx, netdev@xxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
In-reply-to: <200409160525.i8G5PtqG009228@xxxxxxxxxxxxxxxxx>
Organization: jamalopolous
References: <200409160525.i8G5PtqG009228@xxxxxxxxxxxxxxxxx>
Reply-to: hadi@xxxxxxxxxx
Sender: netdev-bounce@xxxxxxxxxxx
On Thu, 2004-09-16 at 01:25, Leonid Grossman wrote:
>  
> > -----Original Message-----
> > From: jamal [mailto:hadi@xxxxxxxxxx] 

> > On a serious note, I think that PCI-express (if it lives upto its
> > expectation) will demolish dreams of a lot of these TOE investments.
> > Our problem is NOT the CPU right now (80% idle processing 
> > 450Kpps forwarding). Bus and memory distance/latency are. 
> 
> In servers, both bottlenecks are there - if you look at the cost of TCP and
> filesystem processing at 10GbE, CPU is a huge problem (and will be for
> foreseeable future), even for fastest 64-bit systems. 

True, but with the bus contention being a non-issue you got more of that
xeon being available for use (lets say i can use 50% more of its
capacity then i can do more). IOW, it becomes a compute capacity problem
mostly - one that you should in theory be able to throw more CPU at. SMT
(the way power5 and some of the network processors do it[1]) should go a
long way to address both additional compute and hardware threading to
work around memory latencies. With PCI-express, compute power in
mini-clustering in the form of AS (http://www.asi-sig.org/home) is being
plotted as we speak.
To sumarize: The problem to solve in 24 months maybe 100Gige.

> I agree though that bus and memory are bigger issues, this is exactly the
> reason for all these RDMA over Ethernet investments :-)

And AS does a damn good job at specing all those RDMA requirements; my
view is that intel is going to build them chips - so it can be done on a
$5 board off the pacific rim. This takes most of the small players out
of the market.

> Anyways, did not mean to start an argument - with all the new CPU, bus and
> HBA technologies coming to the market it will be another 18-24 months before
> we know what works and what doesn't...

Agreed. Would you like to invest on something that will obsoleted in
18-24 months though? OR even not obsoleted, but holds that uncertainty?
I think thats the risk facing you when you are in the offload bussiness.

Here are results for Hifn 7956 ref board on 2.6GHz P4 (HT) system,
kernel  2.6.6 SMP as compared to a s/ware only setup on same machine.
[Name of tester withheld to protect privacy].

first column - algo, second - packet size, third - 
time in us spend by hw crypto, forth - time in us spent by sw crypto:

des   64:       28      3
des  128:       29      6
des  192:       33      9
des  256:       33      12
des  320:       37      15
des  384:       38      18
des  448:       41      21
des  512:       42      23
des  576:       45      26
des  640:       46      29
des  704:       49      33
des  768:       50      35
des  832:       53      38
des  896:       54      41
des  960:       57      44
des 1024:       58      47
des 1088:       61      50
des 1152:       62      53
des 1216:       66      56
des 1280:       66      59
des 1344:       70      62
des 1408:       71      65
des 1472:       74      68
des3_ede   64:  28      6
des3_ede  128:  30      13
des3_ede  192:  34      20
des3_ede  256:  43      26
des3_ede  320:  38      33
des3_ede  384:  48      40
des3_ede  448:  44      45
des3_ede  512:  54      53
des3_ede  576:  50      60
des3_ede  640:  59      67
des3_ede  704:  55      74
des3_ede  768:  66      78
des3_ede  832:  61      85
des3_ede  896:  72      94
des3_ede  960:  67      100
des3_ede 1024:  77      107
des3_ede 1088:  73      114
des3_ede 1152:  82      121
des3_ede 1216:  79      127
des3_ede 1280:  88      128
des3_ede 1344:  84      135
des3_ede 1408:  94      147
des3_ede 1472:  90      153
aes   64:       28      2
aes  192:       33      6
aes  320:       37      10
aes  448:       46      15
aes  576:       53      19
aes  704:       53      23
aes  832:       65      28
aes  960:       66      32
aes 1088:       71      37
aes 1216:       80      41
aes 1344:       83      45
aes 1472:       92      50

Moral of the data above: The 2.6Ghz is already showing signs of
obsoleting the hifn crypto offloader[2]. I think it took less than a
year for it to happen.

cheers,
jamal

[1] I also like the MIPS.com approach to SMT

[2] There are actually issues with some of the crypto offloading in
Linux; however this does serve as a good example.


<Prev in Thread] Current Thread [Next in Thread>