From owner-stp@oss.sgi.com Wed Mar 1 10:12:41 2000
Received: by oss.sgi.com id ; Wed, 1 Mar 2000 10:12:31 -0800
Received: from mars.iol.unh.edu ([132.177.121.222]:54565 "EHLO iol.unh.edu")
    by oss.sgi.com with ESMTP id ; Wed, 1 Mar 2000 10:12:08 -0800
Received: from localhost (cwl@localhost) by iol.unh.edu (8.9.3/8.9.3)
    with ESMTP id NAA15494; Wed, 1 Mar 2000 13:11:35 -0500 (EST)
Date: Wed, 1 Mar 2000 13:11:35 -0500
From: Chris Loveland
To: Aman Singla
cc: stp@oss.sgi.com
Subject: Re: scsi on stp
In-Reply-To: <38BC77C4.EF66A13@engr.sgi.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

thank you very much for your previous response, it has led me to a few
more questions.

On Tue, 29 Feb 2000, Aman Singla wrote:

> Chris,
>
> The data hierarchy associated with Long Message transfers in STP
> is transfer->block->STU; a transfer consists of one or many blocks
> and a block consists of one or many STUs.
>
> A SCSI transaction maps to a STP transfer. The STP stack takes care
> of retransmissions for missed/dropped blocks based on timeouts (or
> any other mechanism like missed ordering of blocks etc.).

what part of the standard defines this error recovery? i was looking at
10.3 of the ST standard, which says the timing out of this operation is the
responsibility of the ULP, which in this case is SCSI. then the ST
standard says that error recovery happens at the level of the transaction.

> Further
> the NIC h/w or firmware, if capable, may take care of STU retransmission
> for dropped STUs.
> For example on GbE a frame would correspond to a
> STU, and lets say a block corresponds to 64 STUs; now, if a frame is
> dropped/lost, the media/physical layer/NIC, if capable, could have
> the remote NIC resend the STU - generally resulting in the protocol
> stack on the host always getting all the blocks;

in the context of gig ethernet what defines this potential retransmission
done by the nic itself? is this something defined by STP? are you
talking about some other protocol running on gig ethernet sitting below
STP doing this retransmission?

> but if the NIC can't
> support STU retransmission, the protocol stack will observe a dropped
> block (a block isn't deemed recd. until all STUs are recd) and would
> request retransmission of the block by resending a CTS. The entire
> transfer will (generally speaking) never have to be redone.
>

do you envision the transmission of a CTS being initiated by the host or
by the nic itself? how is it anticipated that SCSI transactions would be
broken up into blocks? do you picture a SCSI transaction basically
consisting of a block or two, or something closer to one block = one frame?
i would think the CTS would have to be initiated by the host itself, not
the nic. if this is the case then it would probably be preferable to
minimize the number of blocks per transaction in order to minimize the
host/nic interaction. this would mean that a resent CTS corresponds to
the retransmission of a large number of individual frames.
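
To make the block/STU bookkeeping discussed above concrete, here is a
minimal sketch in Python. It is not taken from the ST standard: the names
BlockTracker, on_stu and blocks_to_rerequest are hypothetical, and the
figure of 64 STUs per block is just the "lets say" number from the quoted
text. It only models the rule that a block is not deemed received until
all of its STUs have arrived, so any incomplete block is a candidate for
a resent CTS.

```python
from dataclasses import dataclass, field


@dataclass
class BlockTracker:
    """Hypothetical receiver-side model: which STUs of each block arrived."""
    num_blocks: int
    stus_per_block: int = 64  # illustrative figure from the quoted example
    received: dict = field(default_factory=dict)  # block index -> set of STU indices

    def on_stu(self, block: int, stu: int) -> None:
        """Record the arrival of one STU (e.g. one GbE frame)."""
        if not (0 <= block < self.num_blocks and 0 <= stu < self.stus_per_block):
            raise ValueError("STU lies outside this transfer")
        self.received.setdefault(block, set()).add(stu)

    def block_complete(self, block: int) -> bool:
        """A block counts as received only once every one of its STUs is in."""
        return len(self.received.get(block, ())) == self.stus_per_block

    def blocks_to_rerequest(self) -> list:
        """Blocks for which the host would (in this model) resend a CTS."""
        return [b for b in range(self.num_blocks) if not self.block_complete(b)]


def demo() -> list:
    """Three blocks of 64 STUs each, with a single STU dropped in block 1."""
    tracker = BlockTracker(num_blocks=3)
    for block in range(3):
        for stu in range(64):
            if (block, stu) == (1, 17):
                continue  # simulate one dropped frame
            tracker.on_stu(block, stu)
    return tracker.blocks_to_rerequest()
```

In this model a single lost STU marks only its own block for
retransmission; the other blocks of the transfer are left alone.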

chris

From owner-stp@oss.sgi.com Wed Mar 1 19:34:42 2000
Received: by oss.sgi.com id ; Wed, 1 Mar 2000 19:34:32 -0800
Received: from pneumatic-tube.sgi.com ([204.94.214.22]:1356 "EHLO pneumatic-tube.sgi.com")
    by oss.sgi.com with ESMTP id ; Wed, 1 Mar 2000 19:34:18 -0800
Received: from dsl-lhotse.corp.sgi.com ([192.132.126.170])
    by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam)
    via SMTP id TAA03940 for ; Wed, 1 Mar 2000 19:37:26 -0800 (PST)
    mail_from (aman@engr.sgi.com)
Received: from engr.sgi.com (localhost [127.0.0.1])
    by dsl-lhotse.corp.sgi.com (950413.SGI.8.6.12/970903.SGI.AUTOCF)
    via ESMTP id DAA07934; Thu, 2 Mar 2000 03:31:18 GMT
Message-ID: <38BDE086.F2FF5185@engr.sgi.com>
Date: Wed, 01 Mar 2000 19:31:18 -0800
From: Aman Singla
Organization: Silicon Graphics, Inc.
X-Mailer: Mozilla 4.51C-SGI [en] (X11; I; IRIX 6.2 IP22)
X-Accept-Language: en
MIME-Version: 1.0
To: Chris Loveland
CC: stp@oss.sgi.com
Subject: Re: scsi on stp
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

> what part of the standard defines this error recovery? i was looking at
> 10.3 of the ST standard which says the timing out of this operation is the
> responsibility of the ULP, which in this case is SCSI. then the ST
> standard says that error recovery happens at the level of the transaction.

If you look at the state tables for the various FSMs you'll see that the
actions to be triggered at the various timeouts are clearly defined.
The ULP case may apply to transactions outside the sequences covered by
the various FSMs. SCSI uses the Read/Write sequences - which are defined.

> in the context of gig ethernet what defines this potential retransmission
> done by the nic itself? is this something defined by STP? are you
> talking about some other protocol running on gig ethernet sitting below
> STP doing this retransmission?

The GbE nic doesn't do any retransmission; in the general case, such
transactions would be beyond the scope of STP (they would be extremely
media dependent - like micropackets on GSN; the implication for STP would
be that it sits on a reliable physical medium).

> do you envision the transmission of a CTS being initiated by the host or
> by the nic itself? how is it anticipated that SCSI transactions would be
> broken up into blocks. do you picture a SCSI transaction basically to
> consist of a block or two or something closer to one block = one frame?
> i would think the CTS would have to be initiated by the host itself, not
> the nic. if this is the case then it would probably be preferable to
> minimize the number of blocks per transaction in order to minimize the
> host/nic interaction. this would mean that a resent CTS corresponds to
> the retransmission of a large number of individual frames.

CTS retransmission will be initiated by the host only! One could
implement the whole stack in the NIC - but that's not a desirable/useful
option. The concept of blocks implies the granularity at which the host
operates the protocol stack. The block size used is a tradeoff between
the desire to minimize the host/nic interaction and the cost of
retransmission on errors. The block size to be used is determined by the
configuration of the NIC and the options submitted by the ULP. The ULP
doesn't have to chunk the transfer into blocks; for read/write sequences
that part is handled by the protocol stack (the FSM).
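
The block-size tradeoff described above can be put in rough numbers. The
sketch below is a back-of-the-envelope model in Python, not anything from
the ST standard. It assumes one CTS (one host/nic interaction) per block,
and that losing a single STU costs a resend of the whole block containing
it. The function name block_tradeoff and the 1 KiB STU size used in the
example are hypothetical.

```python
def block_tradeoff(transfer_bytes: int, stu_bytes: int, stus_per_block: int) -> dict:
    """Rough per-transfer cost model (hypothetical, for illustration only)."""
    if min(transfer_bytes, stu_bytes, stus_per_block) <= 0:
        raise ValueError("all sizes must be positive")
    block_bytes = stu_bytes * stus_per_block
    # Ceiling division: a partial final block still needs its own CTS.
    num_blocks = -(-transfer_bytes // block_bytes)
    return {
        "block_bytes": block_bytes,
        # Assumption: one CTS, hence one host/nic interaction, per block.
        "cts_count": num_blocks,
        # Worst case: one lost STU forces a resend of a full block
        # (capped at the transfer size when the block is larger than it).
        "resend_bytes_per_lost_stu": min(block_bytes, transfer_bytes),
    }


def compare(transfer_bytes: int = 1 << 20, stu_bytes: int = 1024) -> list:
    """Evaluate three block sizes for a 1 MiB transfer of 1 KiB STUs."""
    return [(n, block_tradeoff(transfer_bytes, stu_bytes, n)) for n in (1, 64, 1024)]
```

Under these assumptions, a 1 MiB transfer with 64 STUs per block needs 16
CTSs and resends 64 KiB per lost STU. One block per frame needs 1024 CTSs
but resends only 1 KiB, and a single block for the whole transfer needs
one CTS but resends the full 1 MiB.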

:a

From owner-stp@oss.sgi.com Tue Mar 14 11:08:28 2000
Received: by oss.sgi.com id ; Tue, 14 Mar 2000 11:08:17 -0800
Received: from deliverator.sgi.com ([204.94.214.10]:48983 "EHLO deliverator.sgi.com")
    by oss.sgi.com with ESMTP id ; Tue, 14 Mar 2000 11:08:06 -0800
Received: from lhotse.engr.sgi.com (lhotse.engr.sgi.com [163.154.35.41])
    by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam)
    via ESMTP id LAA12793 for ; Tue, 14 Mar 2000 11:03:29 -0800 (PST)
    mail_from (aman@cthulhu.engr.sgi.com)
Received: from engr.sgi.com (localhost [127.0.0.1])
    by lhotse.engr.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF)
    via ESMTP id LAA40460; Tue, 14 Mar 2000 11:06:48 -0800 (PST)
Message-ID: <38CE8DC7.6EDAFD63@engr.sgi.com>
Date: Tue, 14 Mar 2000 11:06:47 -0800
From: Aman Singla
Organization: SGI
X-Mailer: Mozilla 4.7C-SGI [en] (X11; I; IRIX 6.5 IP32)
X-Accept-Language: en
MIME-Version: 1.0
To: "Audet, Martin"
CC: stp@oss.sgi.com
Subject: Re: MPI implementation over STP ?
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

Sorry for the delay in answering your question, Martin.

We're very aware of the fact that running MPI on STP makes a lot of
sense. The only MPI implementation on STP that I'm aware of is within
SGI - on Irix platforms. I'm still a couple of months away from getting
the 'feature set' of STP which is required for MPI working on Linux and
GbE. Once that is done, we're very interested in working to get MPI on
STP - in the form of efforts within SGI, and working with anyone
interested on the outside also - e.g. if somebody were to explain to me
the requirements MPICH would have from STP, we could work towards
delivering them.

Hope this answers your question.

thanks,
:a

>
> Hi,
>
> I saw your article in Linux Weekly News about STP on Linux
> (e.g.: http://lwn.net/2000/0217/a/stp.html).
>
> Can you tell me if you know any project of using MPI over
> STP ?
>
> Since STP allows high bandwidth, low latency user space
> communication and since it supports Gigabit Ethernet
> (and we can expect a price drop for GbE devices), it
> would open the door to a new generation of high
> performance low cost Beowulf parallel systems.
>
> Martin

From owner-stp@oss.sgi.com Tue Mar 14 12:42:18 2000
Received: by oss.sgi.com id ; Tue, 14 Mar 2000 12:42:09 -0800
Received: from gate.imi.nrc.ca ([206.167.202.2]:36623 "EHLO imi.nrc.ca")
    by oss.sgi.com with ESMTP id ; Tue, 14 Mar 2000 12:41:47 -0800
Received: by nrcbouex1.imi.nrc.ca with Internet Mail Service (5.5.2650.21)
    id ; Tue, 14 Mar 2000 15:41:51 -0500
Message-ID:
From: "Audet, Martin"
To: "'lusk.mcs.anl.gov'"
Cc: "'stp@oss.sgi.com'"
Subject: FW: MPI implementation over STP ?
Date: Tue, 14 Mar 2000 15:41:49 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-Type: text/plain; charset="iso-8859-1"
Sender: owner-stp@oss.sgi.com
Precedence: bulk
Return-Path:
X-Orcpt: rfc822;stp-outgoing

Dear MPICH gurus,

I am forwarding you this message since I think it is relevant to you.

As you may know, SGI plans to release its implementation of the Scheduled
Transfer Protocol (STP) under the GPL as part of its Open Source Strategy.

see: http://lwn.net/2000/0217/a/stp.html
and  http://oss.sgi.com/projects

This protocol was developed for high speed networks like 800 MByte/s
HIPPI or Gigabit Ethernet. It was designed for both low latency user
space communications and high bandwidth with minimal host CPU usage. I
think VIA can also be implemented over STP.

Since the price of Gigabit Ethernet devices can be expected to drop and
since hardware acceleration for STP can also be expected, STP and a
proper MPI library implementation could bring high performance to
commodity based Beowulf parallel systems by replacing the usual TCP/IP
stack.

I hope this sounds interesting to you...

Martin

===========================================================================
Martin Audet
Industrial Materials Institute          E-mail: martin.audet@nrc.ca
National Research Council               Tel   : (450) 641-5034
75, de Mortagne, Boucherville, (Que)            (450) 641-5364 (lab.)
J4B 6Y4                                 Fax   : (450) 641-5106
Canada

-----Original Message-----
From: Aman Singla
To: Audet, Martin
Cc: stp@oss.sgi.com
Sent: 3/14/00 2:06 PM
Subject: Re: MPI implementation over STP ?

Sorry for the delay in answering your question, Martin.

We're very aware of the fact that running MPI on STP makes a lot of
sense. The only MPI implementation on STP that I'm aware of is within
SGI - on Irix platforms. I'm still a couple of months away from getting
the 'feature set' of STP which is required for MPI working on Linux and
GbE. Once that is done, we're very interested in working to get MPI on
STP - in the form of efforts within SGI, and working with anyone
interested on the outside also - e.g. if somebody were to explain to me
the requirements MPICH would have from STP, we could work towards
delivering them.

Hope this answers your question.

thanks,
:a

>
> Hi,
>
> I saw your article in Linux Weekly News about STP on Linux
> (e.g.: http://lwn.net/2000/0217/a/stp.html).
>
> Can you tell me if you know any project of using MPI over
> STP ?
>
> Since STP allows high bandwidth, low latency user space
> communication and since it supports Gigabit Ethernet
> (and we can expect a price drop for GbE devices), it
> would open the door to a new generation of high
> performance low cost Beowulf parallel systems.
>
> Martin