From owner-stp@oss.sgi.com Wed Nov 1 07:01:41 2000
Received: by oss.sgi.com id ; Wed, 1 Nov 2000 07:01:31 -0800
Received: from laime.cs.uchicago.edu ([128.135.11.244]:20429 "EHLO laime.cs.uchicago.edu") by oss.sgi.com with ESMTP id ; Wed, 1 Nov 2000 07:01:14 -0800
Received: from candide.cs.uchicago.edu (candide.cs.uchicago.edu [128.135.11.62]) by laime.cs.uchicago.edu (8.10.2/8.9.3) with SMTP id eA1F1DE10741 for ; Wed, 1 Nov 2000 09:01:13 -0600 (CST)
Received: by candide.cs.uchicago.edu (5.57/4.7) id AA01421; Wed, 1 Nov 00 08:59:24 -0600
Message-Id: <10011011459.AA01421@candide.cs.uchicago.edu>
To: stp@oss.sgi.com
Subject: Re: Network perforance and low CPU usage
In-Reply-To: Message from "John Thorp" of "Tue, 31 Oct 2000 16:02:43 GMT." <00bb01c04354$00fd7ba0$316e28c3@servalan.org>
Date: Wed, 01 Nov 2000 09:00:06 -0600
From: Stephen Bailey
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

John,

> I have just started looking at a project which needs to process high data
> volumes.

STP is an appropriate choice. However, note that it is not the
protocol per se that makes STP efficient (high speed data transfer
with low CPU use). It is the fact that the protocol is designed to
enable hardware acceleration. When you use hardware acceleration, you
get high efficiency. With pure software, the efficiency is somewhat
better than TCP, but the protocol doesn't do quite as much for you as
TCP either.

> The idea is to string together a set of machines connected by a
> Gigabit network. A typical data rate would be a sustained 350Mbits/s
> input and output for each machine.

Pekka (Pekka.Pietikainen@cern.ch) published some numbers a little
while ago which were ~47 MB/s using 1-2% CPU. The source was a 2x500
and the sink a 1x400. If you're willing to `cheat' and use Ethernet
jumbo frames, you can get all the way up to 102 MB/s with maybe 5%
CPU. SGI has done ~790 MB/s (!)
on GSN (roughly 10x GigE speed) using < 10% CPU on a run of the mill
Origin 2000 (and we all know how slow the CPUs are on those babies
:^) That's the fastest wire that ST runs on. It was expected that the
protocol would scale at least to the equivalent of 100x GigE.

The trick when you start crawling up to a significant portion of your
memory bandwidth with I/O traffic is to remember that application code
which used to take 5% CPU to execute can start taking a LOT MORE CPU
due to memory contention. 80 MB/s of I/O traffic is going to have
substantial additional main memory consumption overhead. Whether
it'll really interfere depends upon the design of your box.

> We are looking at using the Netgear GA620 network card - is this the
> best?

You must use a card which has ST acceleration. Currently the Alteon
is the only GigE card that has it. Ask Pekka for the particulars.

The Alteon card is using firmware acceleration, and the onboard
processor/code is a bit slow, which is why it is limited to 47 MB/s
with 1500 byte ethernet frames. Somebody in England (I seem to have
lost the reference) did custom Alteon firmware for a similar project
and got wire speed performance. There was some mention of trying to
adapt that firmware to accelerating ST. Again, Pekka can probably
tell you where the bodies are buried with that.

For what you're trying to do, the limitations of the existing firmware
may or may not be OK. It's not clear from your message whether you're
planning to go in and out on the same interface. I would guess so, in
which case, if you don't use jumbo frames, the Alteon firmware may run
out of gas.
Steph

From owner-stp@oss.sgi.com Wed Nov 1 09:38:41 2000
Received: by oss.sgi.com id ; Wed, 1 Nov 2000 09:38:31 -0800
Received: from smtp1.cern.ch ([137.138.128.38]:20236 "EHLO smtp1.cern.ch") by oss.sgi.com with ESMTP id ; Wed, 1 Nov 2000 09:38:25 -0800
Received: from lxplus014.cern.ch (IDENT:root@lxplus014.cern.ch [137.138.161.113]) by smtp1.cern.ch (8.9.3/8.9.3) with ESMTP id SAA20938 for ; Wed, 1 Nov 2000 18:38:17 +0100 (MET)
Received: from localhost (ppieta@localhost) by lxplus014.cern.ch (8.9.3/8.9.3) with SMTP id SAA10721 for ; Wed, 1 Nov 2000 18:38:16 +0100
X-Authentication-Warning: lxplus014.cern.ch: ppieta owned process doing -bs
Date: Wed, 1 Nov 2000 18:38:12 +0100 (CET)
From: Pekka Pietikainen
X-Sender: ppieta@lxplus014.cern.ch
Reply-To: Pekka Pietikainen
To: stp@oss.sgi.com
Subject: Re: Network perforance and low CPU usage
In-Reply-To: <10011011459.AA01421@candide.cs.uchicago.edu>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

> The trick when you start crawling up to a significant portion of your
> memory bandwidth with I/O traffic is to remember that application code
> which used to take 5% CPU to execute can start taking a LOT MORE CPU
> due to memory contention. 80 MB/s of I/O traffic is going to have
> substantial additional main memory consumption overhead. Whether
> it'll really interfere depends upon the design of your box.

This is from the box in the middle doing reads and writes of 256k.
I hacked up the test setup from stuff I found lying around (including
an old Tigon-I, since I didn't have enough Tigon-IIs).

With jumbo frames

# description host  sample_KB total_MB sample_KB/s avge_KB/s cpu_sec user_sec sys_sec sec/MB cpu_pct
1 sink        toy3 209715.203 2306.867   45367.223 45554.204   0.130    0.010   0.120  0.001       3
1 sink        toy3 209715.203 2516.583   45540.930 45551.712   0.160    0.000   0.160  0.001       3
1 sink        toy3 209715.203 2726.298   45582.977 45545.238   0.120    0.000   0.120  0.001       3
1 sink        toy3 209715.203 2936.013   45600.402 45551.686   0.200    0.010   0.190  0.001       4
1 sink        toy3 209715.203 3145.728   45577.043 45562.543   0.190    0.010   0.180  0.001       4

Without them, around 28 MB/s. The machine does have two PCI buses
(L440GX), but I have no idea how they're actually wired. I also used a
non-SMP kernel on the machines by accident, as I just noticed while
trying to figure out whether binding the NICs to different CPUs would
make any difference.

>
> > We are looking at using the Netgear GA620 network card - is this the
> > best?
>
> You must use a card which has ST acceleration. Currently the Alteon
> is the only GigE card that has it. Ask Pekka for the particulars.

Anything based on the Alteon should be fine. The Netgear only has 512k
of memory, which seems to be a bit tight with the STP acceleration
included, so cards with 1MB like the 3c985B might be a better choice.
I'll definitely try to get it working well with 512k boards too,
especially since the Netgears seem to be quite a bit cheaper...

> with 1500 byte ethernet frames. Somebody in England (I seem to have
> lost the reference) did custom Alteon firmware for a similar project
> and got wire speed performance. There was some mention of trying to
> adapt that firmware to accelerating ST. Again, Pekka can probably
> tell you where the bodies are buried with that.
http://www.cl.cam.ac.uk/Research/SRG/netos/arsenic/

From owner-stp@oss.sgi.com Thu Nov 16 12:33:57 2000
Received: by oss.sgi.com id ; Thu, 16 Nov 2000 12:33:47 -0800
Received: from deliverator.sgi.com ([204.94.214.10]:52771 "EHLO deliverator.sgi.com") by oss.sgi.com with ESMTP id ; Thu, 16 Nov 2000 12:33:32 -0800
Received: from nodin.corp.sgi.com (fddi-nodin.corp.sgi.com [198.29.75.193]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id MAA06400 for ; Thu, 16 Nov 2000 12:25:39 -0800 (PST) mail_from (voellm@sgi.com)
Received: from steel.nova.sgi.com (steel.nova.sgi.com [169.238.28.22]) by nodin.corp.sgi.com (980427.SGI.8.8.8/980728.SGI.AUTOCF) via ESMTP id MAA94285 for ; Thu, 16 Nov 2000 12:33:00 -0800 (PST)
Received: from freedom.nova.sgi.com by steel.nova.sgi.com via ESMTP (980427.SGI.8.8.8/930416.SGI) id PAA29141; Thu, 16 Nov 2000 15:30:23 -0500 (EST)
Date: Thu, 16 Nov 2000 15:26:34 -0500
From: "Anthony F. Voellm"
X-Sender: voellm@freedom.nova.sgi.com
To: Douglas Johnson
cc: stp@oss.sgi.com
Subject: Re: stp status
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

Doug,

No problem with the list. It was too large so majordomo rejected it.
Please send pointers to where the code can be ftped. It was greater than
40,000 bytes (the limit).

Take care,
Tony

Anthony F. Voellm
Technical Lead - High Performance Network Engineering
14160 Newbrook Dr., Suite 100
Chantilly, VA 20151
(703) 227-8527 Fax (703) 227-8500
http://reality.sgi.com/voellm_nova/

On Thu, 16 Nov 2000, Douglas Johnson wrote:

> Hi,
> I sent a message out to the stp mailing list a few hours ago and
> it has not returned yet. And this got me thinking, is there a problem with
> the mailing list (might explain its low volume)? Also, are the linux stp
> authors still with SGI?
> There's a correspondence between the Cray split
> dates and the last release of software.
>
> Thanks,
> --
> Doug Johnson
> Systems Developer/Engineer
> The Ohio Supercomputer Center
> djohnson@osc.edu
>

From owner-stp@oss.sgi.com Thu Nov 16 12:47:57 2000
Received: by oss.sgi.com id ; Thu, 16 Nov 2000 12:47:37 -0800
Received: from atlantis.osc.edu ([192.148.249.4]:13786 "EHLO osc.edu") by oss.sgi.com with ESMTP id ; Thu, 16 Nov 2000 12:47:29 -0800
Received: from neptune.osc.edu (IDENT:djohnson@neptune.osc.edu [192.148.249.73]) by osc.edu (8.9.3/8.9.3/OSC 2.0) with ESMTP id PAA20147; Thu, 16 Nov 2000 15:47:24 -0500 (EST)
Date: Thu, 16 Nov 2000 15:47:24 -0500 (EST)
From: Douglas Johnson
X-Sender: djohnson@localhost.localdomain
To: "Anthony F. Voellm"
cc: stp@oss.sgi.com
Subject: Re: stp status
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

Ok, to restate, I've forward-ported the kernel patch for stp from the
2.3.99pre2 version available from the website to version 2.4.0-test10.
The patch can be downloaded from

http://www.osc.edu/~djohnson/stp/patch_stp-0.32a_lk-2.4.0-pre10.gz

The one thing that might be an issue with this patch is the use of
skb_set_owner_w instead of skb_set_owner_c, which no longer exists.
Using the _w routine after the clone in the send routines should be
safe though.

The diff is against a clean 2.4.0-test10. Does anyone have some info on
who the main engineers working on this at SGI will be? Are there any
sites out there that are looking at working on this code? We're very
interested in having an stp implementation on linux.

Doug

On Thu, 16 Nov 2000, Anthony F. Voellm wrote:

> Doug,
>
> No problem with the list. It was too large so majordomo rejected it.
> Please send pointers to where the code can be ftped. It was greater than
> 40,000 bytes (the limit).
>
> Take care,
> Tony
>
> Anthony F. Voellm
> Technical Lead - High Performance Network Engineering
> 14160 Newbrook Dr., Suite 100
> Chantilly, VA 20151
> (703) 227-8527 Fax (703) 227-8500
> http://reality.sgi.com/voellm_nova/
>
> On Thu, 16 Nov 2000, Douglas Johnson wrote:
>
> > Hi,
> > I sent a message out to the stp mailing list a few hours ago and
> > it has not returned yet. And this got me thinking, is there a problem with
> > the mailing list (might explain its low volume)? Also, are the linux stp
> > authors still with SGI? There's a correspondence between the Cray split
> > dates and the last release of software.
> >
> > Thanks,
> > --
> > Doug Johnson
> > Systems Developer/Engineer
> > The Ohio Supercomputer Center
> > djohnson@osc.edu
> >
>

--
Doug Johnson
Systems Developer/Engineer
The Ohio Supercomputer Center
djohnson@osc.edu

From owner-stp@oss.sgi.com Fri Nov 17 06:32:43 2000
Received: by oss.sgi.com id ; Fri, 17 Nov 2000 06:32:33 -0800
Received: from smtp1.cern.ch ([137.138.128.38]:56842 "EHLO smtp1.cern.ch") by oss.sgi.com with ESMTP id ; Fri, 17 Nov 2000 06:32:23 -0800
Received: from lxplus006.cern.ch (IDENT:root@lxplus006.cern.ch [137.138.161.121]) by smtp1.cern.ch (8.9.3/8.9.3) with ESMTP id PAA26455; Fri, 17 Nov 2000 15:31:44 +0100 (MET)
Received: from localhost (ppieta@localhost) by lxplus006.cern.ch (8.9.3/8.9.3) with SMTP id PAA00941; Fri, 17 Nov 2000 15:31:42 +0100
X-Authentication-Warning: lxplus006.cern.ch: ppieta owned process doing -bs
Date: Fri, 17 Nov 2000 15:31:42 +0100 (CET)
From: Pekka Pietikainen
X-Sender: ppieta@lxplus006.cern.ch
Reply-To: Pekka Pietikainen
To: stp@oss.sgi.com, Douglas Johnson
Subject: Re: stp status
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

> http://www.osc.edu/~djohnson/stp/patch_stp-0.32a_lk-2.4.0-pre10.gz
>
> The diff is against a clean 2.4.0-test10.
> Does anyone have some info on
> who the main engineers working on this at SGI will be? Are there any
> sites out there that are looking at working on this code? We're very
> interested in having an stp implementation on linux.

Hi,

I am working at CERN on the Linux STP stuff for my Master's thesis,
and have worked on the code quite a bit. I've put an updated version
of a patch I sent to Anthony a while ago at

http://ppieta.home.cern.ch/ppieta/stpdiff-2.4.0test10.gz

Main changes since the 0.32a-2.3.99pre2 release

- port to 2.4.0test10
- when using estp and the direct tx hack, the length field wasn't
  set properly
- ST accelerated acenic driver renamed to acenic_stp so you can
  easily switch between drivers.
- ST accelerated firmware loads and works (Tigon-II with 1MB required
  for best performance, things are a bit tight with 512k, but it
  should work with those too), also added some DMA avoidance patches
  done by the Linux zero-copy sendfile project(*) which seem to help
  performance (around 10-15% with TCP and non-jumbo ST, jumbo ST went
  down a few percent)
- a failed assert() doesn't panic the machine anymore (so I don't
  have to run downstairs every time I hit a bug, but instead can just
  unload/reload the module =) )
- The code didn't work when using it on a kernel with IPv6 compiled
  in: stvd/estp.c used MAX_HEADER, which is bigger with IPv6 in the
  kernel, and thus there was a buffer overrun causing nice crashes.
  Changed it to LL_MAX_HEADER (which is the same as MAX_HEADER on
  non-IPv6/token ring configurations)

(*) ftp://ftp.inr.ac.ru/ip-routing/zerocopy-sendfile-001113.dif.gz and
acenic-sg-001110.tar.gz

They are currently adding support for zero-copy transmits to drivers
that can support it, so as soon as that stuff has stabilized I intend
to change STP to use that instead of the current direct_data_ptr.

From owner-stp@oss.sgi.com Tue Nov 28 08:34:25 2000
Received: by oss.sgi.com id ; Tue, 28 Nov 2000 08:34:15 -0800
Received: from smtp1.cern.ch ([137.138.128.38]:50952 "EHLO smtp1.cern.ch") by oss.sgi.com with ESMTP id ; Tue, 28 Nov 2000 08:33:54 -0800
Received: from lxplus007.cern.ch (IDENT:root@lxplus007.cern.ch [137.138.161.120]) by smtp1.cern.ch (8.9.3/8.9.3) with ESMTP id RAA06981 for ; Tue, 28 Nov 2000 17:33:47 +0100 (MET)
Received: from localhost (ppieta@localhost) by lxplus007.cern.ch (8.9.3/8.9.3) with ESMTP id RAA30212 for ; Tue, 28 Nov 2000 17:33:47 +0100
X-Authentication-Warning: lxplus007.cern.ch: ppieta owned process doing -bs
Date: Tue, 28 Nov 2000 17:33:47 +0100 (CET)
From: Pekka Pietikainen
X-Sender: ppieta@lxplus007.cern.ch
To: stp@oss.sgi.com
Subject: Some new stuff
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

I've changed the firmware a bit more so that it works much better with
cards with 512k; you can find the binary at

http://ppieta.home.cern.ch/ppieta/alt_fw2.h

Performance seems to be not much worse than on 1MB cards, except that,
since there is only about 220k left on the board for receive buffers,
256k blocks won't work (you can do ~200k writes just fine, though, and
get almost the same performance). I suppose I should add something
into the acenic driver for preventing the use of too big blocks...
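
The block-size limit described above comes down to simple arithmetic,
which a guard of the kind Pekka mentions would have to perform. The
following is only a sketch of that check, not code from the acenic
driver or the firmware: the function and constant names are
hypothetical, "k" is assumed to mean 1024 bytes, and the 220k figure
is the approximate value quoted in the message.

```python
# Hypothetical sketch of a block-size guard; none of these names exist
# in the acenic driver or the STP patch.

KIB = 1024  # assumption: "k" in the message means 1024 bytes

# Approximate on-board memory left for receive buffers on a 512k card,
# as quoted in the message ("about 220k").
RX_BUFFER_BYTES_512K_CARD = 220 * KIB


def block_fits(block_bytes: int,
               rx_buffer_bytes: int = RX_BUFFER_BYTES_512K_CARD) -> bool:
    """Return True if one block fits in the card's receive buffer memory."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    return block_bytes <= rx_buffer_bytes


def clamp_block(block_bytes: int,
                rx_buffer_bytes: int = RX_BUFFER_BYTES_512K_CARD) -> int:
    """Return the requested size, reduced if needed so that it fits."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    return min(block_bytes, rx_buffer_bytes)


if __name__ == "__main__":
    for size_k in (200, 256):
        size = size_k * KIB
        verdict = "fits" if block_fits(size) else "too big"
        print(f"{size_k}k block: {verdict}, clamped to {clamp_block(size) // KIB}k")
```

Whether a real guard should reject an oversized block or silently
clamp it is a design choice the message leaves open; the sketch shows
both options.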

From owner-stp@oss.sgi.com Tue Nov 28 08:47:05 2000
Received: by oss.sgi.com id ; Tue, 28 Nov 2000 08:46:55 -0800
Received: from pneumatic-tube.sgi.com ([204.94.214.22]:58662 "EHLO pneumatic-tube.sgi.com") by oss.sgi.com with ESMTP id ; Tue, 28 Nov 2000 08:46:39 -0800
Received: from steel.nova.sgi.com (steel.nova.sgi.com [169.238.28.22]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id IAA01519 for ; Tue, 28 Nov 2000 08:54:39 -0800 (PST) mail_from (voellm@sgi.com)
Received: from freedom.nova.sgi.com by steel.nova.sgi.com via ESMTP (980427.SGI.8.8.8/930416.SGI) for id LAA05793; Tue, 28 Nov 2000 11:45:21 -0500 (EST)
Date: Tue, 28 Nov 2000 11:40:57 -0500
From: "Anthony F. Voellm"
X-Sender: voellm@freedom.nova.sgi.com
To: stp@oss.sgi.com
Subject: Welcome Bryan Bush
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

Hello All,

I just wanted to let you know that Bryan Bush (bbush@sgi.com) will be
integrating all the latest source submissions to SGI. Please feel free to
contact him.

Right now Bryan is working on Pekka's and Doug's submissions.

Take care,
Tony

From owner-stp@oss.sgi.com Wed Nov 29 08:34:02 2000
Received: by oss.sgi.com id ; Wed, 29 Nov 2000 08:33:52 -0800
Received: from atlantis.osc.edu ([192.148.249.4]:23263 "EHLO osc.edu") by oss.sgi.com with ESMTP id ; Wed, 29 Nov 2000 08:33:32 -0800
Received: from neptune.osc.edu (IDENT:djohnson@neptune.osc.edu [192.148.249.73]) by osc.edu (8.9.3/8.9.3/OSC 2.0) with ESMTP id LAA12852; Wed, 29 Nov 2000 11:32:55 -0500 (EST)
Date: Wed, 29 Nov 2000 11:32:55 -0500 (EST)
From: Douglas Johnson
X-Sender: djohnson@localhost.localdomain
To: "Anthony F. Voellm"
cc: stp@oss.sgi.com
Subject: Re: Welcome Bryan Bush
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

Pekka's submission for a patch to the newer kernels should be used in
place of mine. I will be submitting some things to clean up the patch
a little in a few days.

Doug

On Tue, 28 Nov 2000, Anthony F. Voellm wrote:

> Hello All,
>
> I just wanted to let you know that Bryan Bush (bbush@sgi.com) will be
> integrating all the latest source submissions to SGI. Please feel free to
> contact him.
>
> Right now Bryan is working on Pekka's and Doug's submissions.
>
> Take care,
> Tony