From: Michael Vittrup Larsen
Organization: Ericsson
To: Stephen Hemminger
Cc: "David S. Miller", netdev@oss.sgi.com
Subject: Re: [PATCH] tcp: efficient port randomisation
Date: Tue, 2 Nov 2004 09:54:44 +0200
Message-Id: <200411020854.44745.michael.vittrup.larsen@ericsson.com>
In-Reply-To: <20041101092027.2a741e82@zqx3.pdx.osdl.net>

On Monday 01 November 2004 18:20, Stephen Hemminger wrote:
> > * It is probably a good strategy to set 'tcp_rover_next' such that
> >   the next search is resumed from the previous port found to be free.
> >   (similar to the old algorithm).  I don't see this in your patch,
> >   but of course I could have missed it.
>
> It was intentional since it would require holding a lock around the search.
> The tradeoff is better SMP performance in the sparsely filled port space
> (more typical) vs. better UP performance in the case of a mostly full port
> space.

I think a typical scenario is many short-lived (e.g. minutes) TCP
connections, a few long-lived (e.g. hours) connections, and an ephemeral
port wrap-around that probably also takes hours, or at least a long time
compared to the lifetime of the short-lived connections.  This results
in a closely spaced 'group' of occupied ports somewhere in the ephemeral
port range, with 'tcp_rover_next' pointing at the upper end of this
group.  The search then practically always finds a free port on the
first try (collisions will only happen with long-lived connections).
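The scenario above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code or the patch: the names `port_space`, `alloc_port` and `update_rover`, and the range constants, are hypothetical, and no locking is modelled. The function does a linear search from a rover and reports how many slots it had to examine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ephemeral range; the numbers are illustrative only. */
#define PORT_LOW   32768u
#define PORT_HIGH  61000u               /* exclusive upper bound */
#define NPORTS     (PORT_HIGH - PORT_LOW)

struct port_space {
    bool in_use[NPORTS];    /* in_use[i] describes port PORT_LOW + i */
    unsigned rover;         /* offset at which the next search starts */
};

/*
 * Linear search starting at the rover.  Returns the allocated port, or 0
 * if every port in the range is taken.  *probes receives the number of
 * slots examined.  When update_rover is false the rover is left where it
 * was, so later searches start from the same place again.
 */
static unsigned alloc_port(struct port_space *ps, bool update_rover,
                           unsigned *probes)
{
    assert(ps != NULL && probes != NULL);

    for (unsigned i = 0; i < NPORTS; i++) {
        unsigned off = (ps->rover + i) % NPORTS;

        if (!ps->in_use[off]) {
            ps->in_use[off] = true;
            if (update_rover)
                ps->rover = (off + 1) % NPORTS;  /* resume after this port */
            *probes = i + 1;
            return PORT_LOW + off;
        }
    }
    *probes = NPORTS;
    return 0;
}
```

In this sketch, as long as no ports are released, every call with `update_rover` set examines exactly one slot, while with the flag cleared the N-th call examines N slots because it has to walk over everything allocated before it.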
If you don't update 'tcp_rover_next', and it somehow comes to lag behind
this 'group' of ports (say it points at the lower end), you will need to
search through the group before you reach the unoccupied port space.
Your scheme works initially because you do not lag behind the free port
space, but eventually you will, and I think this will result in worse
performance than the old behaviour.

Since updating 'tcp_rover_next' practically always results in a free
port on the first try, I think SMP performance will not suffer even if
the lock is held throughout the port search (except when the port space
is very crowded).  And yes, I do use Linux exclusively, so I do care :-))

From a statistical point of view, if the connection lifetimes are
uniformly distributed from zero to infinity (a theoretical scenario), it
does not matter what starting point you use.  However, as soon as
lifetimes are not uniformly distributed, this kind of search algorithm
benefits from a good starting point, one that marks where the
probability of a port being in use drops from high to low.  The BSD
solution with a purely random rover suffers similarly, especially when
the port space becomes crowded.

> > * connect_port_offset() does not (at least from an algorithm point
> >   of view) need to return an u32, an u16 is sufficient.
>
> If it is truncated to u16, then compiler has to take extra effort to
> truncate is unnecessary given later  modulo operation.

I agree (in fact that's what I argued in the draft).  It probably
depends on your platform; you are assuming a 32-bit platform, I guess.
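The u16/u32 point can be illustrated with a small sketch. The helper names `pick_port` and `pick_port_u16` are hypothetical and this is not the patch's connect_port_offset(); it only shows that the modulo already confines the result to the port range, so narrowing the offset beforehand changes nothing about the bounds. Whether the extra narrowing costs anything is an assumption that depends on compiler and architecture.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a hash-derived offset into [low, high).
 * The modulo keeps the result in range regardless of the offset width.
 */
static uint16_t pick_port(uint32_t offset, uint16_t low, uint16_t high)
{
    assert(low < high);

    uint32_t range = (uint32_t)high - low;

    /* low + (offset % range) < high <= 65535, so the cast is lossless. */
    return (uint16_t)(low + offset % range);
}

/*
 * Same mapping, but with the offset narrowed to 16 bits first.  The
 * result is still in range; the narrowing is simply an extra step.
 */
static uint16_t pick_port_u16(uint32_t offset, uint16_t low, uint16_t high)
{
    return pick_port((uint16_t)offset, low, high);
}
```

Both variants return a port inside the range for any input. They generally pick different ports for the same offset, which is harmless here because the offset is only a pseudo-random starting point.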