From cfriesen@nortelnetworks.com Sat Mar 1 22:03:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 01 Mar 2003 22:03:47 -0800 (PST) Received: from zcars0m9.nortelnetworks.com (zcars0m9.nortelnetworks.com [47.129.242.157]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h2263heA017610 for ; Sat, 1 Mar 2003 22:03:44 -0800 Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars0m9.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h22634q11232; Sun, 2 Mar 2003 01:03:04 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDF4RWX4; Sun, 2 Mar 2003 01:03:05 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YWNZ; Sun, 2 Mar 2003 01:03:05 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 2E9602E12F; Sun, 2 Mar 2003 01:03:04 -0500 (EST) Message-ID: <3E619E97.8010508@nortelnetworks.com> Date: Sun, 02 Mar 2003 01:03:03 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: jamal Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? 
References: <3E5E7081.6020704@nortelnetworks.com> <20030228083009.Y53276@shell.cyberus.ca> <3E5F748E.2080605@nortelnetworks.com> <20030228212309.C57212@shell.cyberus.ca> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1828 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 3163 Lines: 72

jamal wrote:

> Did you also measure throughput?

No.  lmbench doesn't appear to test UDP socket local throughput.

> You are overlooking the flexibility that already exists in IP based
> transports as an advantage; the fact that you can make them
> distributed instead of localized with a simple addressing change
> is a very powerful abstraction.

True.  On the other hand, the same could be said about unicast IP
sockets vs unix sockets.  Unix sockets exist for a reason, and I'm
simply proposing to extend them.

>> From userspace, multicast unix would be *simple* to use, as in totally
>> transparent.

> You could implement the abstraction in user space as a library today by
> having some server that muxes to several registered clients.

This is what we have now, though with a suboptimal solution (we
inherited it from another group).  The disadvantage with this is that it
adds a send/schedule/receive iteration.  If you have a small number of
listeners this can have a large effect percentage-wise on your messaging
cost.  The kernel approach also cuts the number of syscalls required by
a factor of two compared to the server-based approach.

> So what's the addressing scheme for multicast unix? Would it be a
> reserved path?

Actually I was thinking it could be arbitrary, with a flag in the unix
part of struct sock saying that it was actually a multicast address.
The API would be something like the IP multicast one, where you get and
bind a normal socket and then use setsockopt to attach yourself to one
or more multicast addresses.  A given address could be multicast or
not, but they would reside in the same namespace and would collide as
currently happens.  The only way to create a multicast address would be
the setsockopt call--if the address doesn't already exist a socket would
be created by the kernel and bound to the desired address.

To see if it's feasible I've actually coded up a proof-of-concept that
seems to do fairly well.  I tested it with a process sending an 8-byte
packet containing a timestamp to three listeners, who checked the time
on receipt and printed out the difference.

For comparison I have two different userspace implementations, one with
a server process (very simple for test purposes) and the other using an
mmap'd file to store which process is listening to what messages.

The timings (in usec) for the delays to each of the listeners were as
follows on my duron 750:

userspace server:      104   133   153
userspace no server:    72   111   138
kernelspace:            60    91   113

As you can see, the kernelspace code is the fastest, and since it's in the
kernel it can be written to avoid being scheduled out while holding
locks, which is hard to avoid with the no-server userspace option.

If this sounds at all interesting I would be glad to post a patch so you
could shoot holes in it, otherwise I'll continue working on it privately.

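A minimal sketch of the kind of timing test described above, for the plain unicast baseline only (one send per listener over existing AF_UNIX datagram sockets). It is not the proof-of-concept code from this mail. The helper names (`now_usec`, `make_pairs`, `fanout_timestamp`, `receive_delay`) are made up for illustration, and `socketpair()` plus a monotonic clock are assumptions chosen so that the sketch is self-contained:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

#define NLISTENERS 3

/* Monotonic time in microseconds (system-wide on Linux, so it is
 * comparable between a sender and its listeners on the same box). */
static uint64_t now_usec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Create n connected AF_UNIX datagram pairs: tx[i] sends, rx[i] receives. */
static int make_pairs(int *tx, int *rx, int n)
{
    for (int i = 0; i < n; i++) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
            return -1;
        tx[i] = sv[0];
        rx[i] = sv[1];
    }
    return 0;
}

/* Send one 8-byte timestamp to every listener: one syscall per listener,
 * which is exactly the cost a multicast address would avoid.
 * Returns the number of listeners reached, or -1 on the first error. */
static int fanout_timestamp(const int *tx, int n)
{
    assert(n > 0);
    uint64_t stamp = now_usec();
    for (int i = 0; i < n; i++) {
        if (send(tx[i], &stamp, sizeof(stamp), 0) != (ssize_t)sizeof(stamp))
            return -1;
    }
    return n;
}

/* Receive one timestamp; return the delay since it was taken (usec),
 * or -1 if the datagram could not be read. */
static int64_t receive_delay(int rx)
{
    uint64_t stamp;
    if (recv(rx, &stamp, sizeof(stamp), 0) != (ssize_t)sizeof(stamp))
        return -1;
    return (int64_t)(now_usec() - stamp);
}
```

In a real measurement each listener would be a separate process blocked in `recv()`, so that scheduling latency shows up in the numbers. Calling `receive_delay()` from the sending process only exercises the mechanism.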
Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From hadi@cyberus.ca Sun Mar 2 06:12:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 02 Mar 2003 06:12:29 -0800 (PST) Received: from mx03.cyberus.ca (mx03.cyberus.ca [216.191.240.24]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h22ECAeA028572 for ; Sun, 2 Mar 2003 06:12:16 -0800 Received: from shell.cyberus.ca ([216.191.240.114]) by mx03.cyberus.ca with esmtp (Exim 4.10) id 18pUCD-0000nI-00; Sun, 02 Mar 2003 09:12:05 -0500 Received: from shell.cyberus.ca (localhost.cyberus.ca [127.0.0.1]) by shell.cyberus.ca (8.12.6/8.12.6) with ESMTP id h22EBeYO063470; Sun, 2 Mar 2003 09:11:40 -0500 (EST) (envelope-from hadi@cyberus.ca) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.12.6/8.12.6/Submit) with ESMTP id h22EBdxn063467; Sun, 2 Mar 2003 09:11:40 -0500 (EST) (envelope-from hadi@cyberus.ca) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Sun, 2 Mar 2003 09:11:39 -0500 (EST) From: jamal To: Chris Friesen cc: linux-kernel@vger.kernel.org, "" , "" Subject: Re: anyone ever done multicast AF_UNIX sockets? In-Reply-To: <3E619E97.8010508@nortelnetworks.com> Message-ID: <20030302081916.S61365@shell.cyberus.ca> References: <3E5E7081.6020704@nortelnetworks.com> <20030228083009.Y53276@shell.cyberus.ca> <3E5F748E.2080605@nortelnetworks.com> <20030228212309.C57212@shell.cyberus.ca> <3E619E97.8010508@nortelnetworks.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1829 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 3825 Lines: 94 On Sun, 2 Mar 2003, Chris Friesen wrote: > jamal wrote: > > Did you also measure throughput? > > No. 
> lmbench doesn't appear to test UDP socket local throughput.

I think you need to collect all data if you are trying to show
improvements.

> > You are overlooking the flexibility that already exists in IP based
> > transports as an advantage; the fact that you can make them
> > distributed instead of localized with a simple addressing change
> > is a very powerful abstraction.
>
> True.  On the other hand, the same could be said about unicast IP
> sockets vs unix sockets.  Unix sockets exist for a reason, and I'm
> simply proposing to extend them.
>

You are treading into areas where unix sockets make less sense compared
to IP sockets.  Good design rules (should actually read "lazy design
rules"): sometimes you gotta move to a round peg instead of trying to
make the square one round.

> > You could implement the abstraction in user space as a library today by
> > having some server that muxes to several registered clients.
>
> This is what we have now, though with a suboptimal solution (we
> inherited it from another group).  The disadvantage with this is that it
> adds a send/schedule/receive iteration.  If you have a small number of
> listeners this can have a large effect percentage-wise on your messaging
> cost.  The kernel approach also cuts the number of syscalls required by
> a factor of two compared to the server-based approach.
>

Ok, so it's only a problem when you have a few listeners, i.e. the user
space scheme scales just fine as you keep adding listeners.
In your tests, what was the break-even point?

> > So what's the addressing scheme for multicast unix? Would it be a
> > reserved path?
>
> Actually I was thinking it could be arbitrary, with a flag in the unix
> part of struct sock saying that it was actually a multicast address.
> The API would be something like the IP multicast one, where you get and
> bind a normal socket and then use setsockopt to attach yourself to one
> or more multicast addresses.
> A given address could be multicast or
> not, but they would reside in the same namespace and would collide as
> currently happens.  The only way to create a multicast address would be
> the setsockopt call--if the address doesn't already exist a socket would
> be created by the kernel and bound to the desired address.
>

Addressing has to be backward compatible, i.e. not affecting any other
program.

> To see if it's feasible I've actually coded up a proof-of-concept that
> seems to do fairly well.  I tested it with a process sending an 8-byte
> packet containing a timestamp to three listeners, who checked the time
> on receipt and printed out the difference.
>
> For comparison I have two different userspace implementations, one with
> a server process (very simple for test purposes) and the other using an
> mmap'd file to store which process is listening to what messages.
>
> The timings (in usec) for the delays to each of the listeners were as
> follows on my duron 750:
>
> userspace server:      104   133   153
> userspace no server:    72   111   138
> kernelspace:            60    91   113
>
> As you can see, the kernelspace code is the fastest, and since it's in the
> kernel it can be written to avoid being scheduled out while holding
> locks, which is hard to avoid with the no-server userspace option.
>

Actually, the difference between the user space server and the kernel
doesn't appear that big.  What you need to do is collect more data:
repeat with an incrementing number of listeners.

> If this sounds at all interesting I would be glad to post a patch so you
> could shoot holes in it, otherwise I'll continue working on it privately.
>

No rush, let's see your test data first, and then you gotta do a better
sales job on the cost/benefit/flexibility ratios.

cheers, jamal From zjp@iscas.ac.cn Sun Mar 2 23:45:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 02 Mar 2003 23:45:08 -0800 (PST) Received: from mail.iscas.ac.cn ([159.226.5.56]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h237j2eA007487 for ; Sun, 2 Mar 2003 23:45:03 -0800 Received: (qmail 24812 invoked by uid 104); 3 Mar 2003 07:44:34 -0000 Received: from zjp@iscas.ac.cn by mail.iscas.ac.cn by uid 0 with qmail-scanner-1.14 (hbedv: 6.15.0.1. hbedv: operating system: Linux (glibc). hbedv: product version: 2.0.4. hbedv: engine version: 6.15.0.1. hbedv: packlib version: 2.0.0.8 (supports 19 formats). hbedv: vdf version: 6.15.0.7 (66928 recognized forms). hbedv: . hbedv: product: AntiVir Workstation. hbedv: key file: hbedv.key. hbedv: registered user: irene, 123. hbedv: serial number: 1001020203. hbedv: key expires: 31 May 2003. hbedv: run mode: PRIVATE. hbedv: . hbedv: product: AntiVir MailGate. hbedv: key file: hbedv.key. hbedv: registered user: irene, 123. hbedv: serial number: 1001020203. hbedv: key expires: 31 May 2003. hbedv: run mode: PRIVATE. hbedv: . hbedv: product: AntiVir (command line scanner). hbedv: key file: hbedv.key. hbedv: registered user: irene, 123. hbedv: serial number: 1001020203. hbedv: key expires: 31 May 2003. hbedv: run mode: PRIVATE. Clear:. 
Processed in 0.260838 secs); 03 Mar 2003 07:44:34 -0000 Received: from unknown (HELO zhengjp) (zjp@159.226.5.59) by mail.iscas.ac.cn with SMTP; 3 Mar 2003 07:44:33 -0000 Message-ID: <003901c2e159$5d932ae0$6c05a8c0@zhengjp> From: "Zheng Jianping" To: Subject: How to set IPv6 router alert option Date: Mon, 3 Mar 2003 15:49:05 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-archive-position: 1830 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zjp@iscas.ac.cn Precedence: bulk X-list: netdev Content-Length: 469 Lines: 19

Hi,

I want to send a packet (an MLD query message) with the IPv6 router alert
option through a socket.  After creating an ICMPv6 socket, how do I send an
MLD query message through it?

Thanks,
Zheng Jianping
-------------------------------------------------------
Multimedia Communication & Network Engineering Research Center
Institute of Software, Chinese Academy of Sciences
Email: zjp@iscas.ac.cn
Tel: 6255,5523

From davem@redhat.com Mon Mar 3 01:02:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 01:03:02 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h2392meA008716 for ; Mon, 3 Mar 2003 01:02:48 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id AAA00650; Mon, 3 Mar 2003 00:44:58 -0800 Date: Mon, 03 Mar 2003 00:44:57 -0800 (PST) Message-Id: <20030303.004457.24252283.davem@redhat.com> To: bwa@us.ibm.com Cc: lksctp-developers@lists.sourceforge.net, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] subset of RFC2553 From: "David S. 
Miller" In-Reply-To: <1046109300.3503.12.camel@w-bwa1.beaverton.ibm.com> References: <1045847170.3104.7.camel@w-bwa1.beaverton.ibm.com> <20030221.232639.129509431.davem@redhat.com> <1046109300.3503.12.camel@w-bwa1.beaverton.ibm.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1831 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 425 Lines: 14 From: Bruce Allan Date: 24 Feb 2003 09:54:57 -0800 On Fri, 2003-02-21 at 23:26, David S. Miller wrote: > > Bruce, while applying this I noticed that in6addr_{any,loopback} > are not exported by modules. > > Please send me a small patch to add the exports if this will be > needed by SCTP and friends. Doh! Sorry, here (see below) it is against 2.5.59. Applied, thanks. From davem@redhat.com Mon Mar 3 01:12:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 01:12:23 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h239CGeA009155 for ; Mon, 3 Mar 2003 01:12:17 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id AAA00686; Mon, 3 Mar 2003 00:54:29 -0800 Date: Mon, 03 Mar 2003 00:54:28 -0800 (PST) Message-Id: <20030303.005428.96142819.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru, pekkas@netcore.fi, usagi@linux-ipv6.org Subject: Re: [PATCH] IPv6: Privacy Extensions for Stateless Address Autoconfiguration in IPv6 From: "David S. 
Miller" In-Reply-To: <20030226.004155.71903869.yoshfuji@linux-ipv6.org> References: <20021101.174832.44646503.yoshfuji@linux-ipv6.org> <20030223.223114.65976206.davem@redhat.com> <20030226.004155.71903869.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 1832 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 267 Lines: 7

   From: YOSHIFUJI Hideaki / 吉藤英明
   Date: Wed, 26 Feb 2003 00:41:55 +0900 (JST)

   Well, I've found a bug where temporary addresses were not
   re-generated properly.  Here's the patch for linux-2.5.63.

Fix applied, thanks.

From davem@redhat.com Mon Mar 3 01:46:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 01:46:50 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h239kceA024540 for ; Mon, 3 Mar 2003 01:46:39 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id BAA00846; Mon, 3 Mar 2003 01:28:26 -0800 Date: Mon, 03 Mar 2003 01:28:25 -0800 (PST) Message-Id: <20030303.012825.81834528.davem@redhat.com> To: latten@austin.ibm.com Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: PATCH: IPSec not using padding when Null Encryption From: "David S. Miller" In-Reply-To: <200302272129.h1RLTJW28434@faith.austin.ibm.com> References: <200302272129.h1RLTJW28434@faith.austin.ibm.com> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1833 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 397 Lines: 10 From: latten@austin.ibm.com Date: Thu, 27 Feb 2003 15:29:19 -0600 Ok, anyway, this fix just pretty much makes sure that when Null Encryption or any algorithm with a blocksize less than 4 is used, that the ciphertext, any padding, and next-header and pad-length fields terminate on a 4-byte boundary. I have tested it. Please let me know if all is well. Applied, thanks. From davem@redhat.com Mon Mar 3 01:47:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 01:47:58 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h239lqeA028214 for ; Mon, 3 Mar 2003 01:47:53 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id BAA00857; Mon, 3 Mar 2003 01:30:13 -0800 Date: Mon, 03 Mar 2003 01:30:13 -0800 (PST) Message-Id: <20030303.013013.93812658.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: Re: [PATCH] Use C99 initializers in net/ipv6 From: "David S. Miller" In-Reply-To: <20030228.065944.08980219.yoshfuji@linux-ipv6.org> References: <20030228.065944.08980219.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 1834 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 316 Lines: 8

   From: YOSHIFUJI Hideaki / 吉藤英明
   Date: Fri, 28 Feb 2003 06:59:44 +0900 (JST)

   This converts net/ipv6/{addrconf,route,sit}.c files to use C99
   initializers.  We don't touch net/ipv6/exthdrs.c for now because it
   will conflict with our patch for IPsec.

Applied, thanks.

From davem@redhat.com Mon Mar 3 01:52:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 01:52:46 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h239qfeA001941 for ; Mon, 3 Mar 2003 01:52:41 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id BAA00880; Mon, 3 Mar 2003 01:34:52 -0800 Date: Mon, 03 Mar 2003 01:34:51 -0800 (PST) Message-Id: <20030303.013451.20307573.davem@redhat.com> To: jmorris@intercode.com.au Cc: toml@us.ibm.com, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru Subject: Re: IPSec: setkey -DP freezes machine From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1835 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 427 Lines: 11 From: James Morris Date: Sat, 1 Mar 2003 03:01:04 +1100 (EST) Alternatively, a family parameter could be added to the compile_policy() operation, but this duplicates data already present in our native xfrm_userpolicy_info format. I like this solution, it seems the cleanest. Could someone implement this fix and send me the patch? I'm very backlogged for the next day or so... From jmorris@intercode.com.au Mon Mar 3 04:14:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 04:14:30 -0800 (PST) Received: from blackbird.intercode.com.au (blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23CEBeA007035 for ; Mon, 3 Mar 2003 04:14:13 -0800 Received: from localhost (jmorris@localhost) by blackbird.intercode.com.au (8.9.3/8.9.3) with ESMTP id XAA10331; Mon, 3 Mar 2003 23:13:55 +1100 Date: Mon, 3 Mar 2003 23:13:55 +1100 (EST) From: James Morris To: "David S. Miller" cc: toml@us.ibm.com, , Subject: [PATCH] Re: IPSec: setkey -DP freezes machine In-Reply-To: <20030303.013451.20307573.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1836 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev Content-Length: 3622 Lines: 106 On Mon, 3 Mar 2003, David S. Miller wrote: > Alternatively, a family parameter could be added to the compile_policy() > operation, but this duplicates data already present in our native > xfrm_userpolicy_info format. 
>
> I like this solution, it seems the cleanest.
>

Ok, here's a patch which does this.

I've also added a check to verify_newpolicy_info() so that we don't run into
the same problem for policies provided via the netlink interface.

Tom, would you let me know if this works for you, as my racoon isn't
working yet.


- James
-- 
James Morris

diff -urN -X dontdiff linux-2.5.63.orig/include/net/xfrm.h linux-2.5.63.w1/include/net/xfrm.h
--- linux-2.5.63.orig/include/net/xfrm.h	Fri Feb 21 00:44:01 2003
+++ linux-2.5.63.w1/include/net/xfrm.h	Mon Mar  3 22:19:40 2003
@@ -223,7 +223,7 @@
 	char *id;
 	int (*notify)(struct xfrm_state *x, int event);
 	int (*acquire)(struct xfrm_state *x, struct xfrm_tmpl *, struct xfrm_policy *xp, int dir);
-	struct xfrm_policy *(*compile_policy)(int opt, u8 *data, int len, int *dir);
+	struct xfrm_policy *(*compile_policy)(u16 family, int opt, u8 *data, int len, int *dir);
 };
 
 extern int xfrm_register_km(struct xfrm_mgr *km);
diff -urN -X dontdiff linux-2.5.63.orig/net/ipv4/xfrm_state.c linux-2.5.63.w1/net/ipv4/xfrm_state.c
--- linux-2.5.63.orig/net/ipv4/xfrm_state.c	Fri Feb 21 00:44:01 2003
+++ linux-2.5.63.w1/net/ipv4/xfrm_state.c	Mon Mar  3 22:23:53 2003
@@ -680,7 +680,7 @@
 	err = -EINVAL;
 	read_lock(&xfrm_km_lock);
 	list_for_each_entry(km, &xfrm_km_list, list) {
-		pol = km->compile_policy(optname, data, optlen, &err);
+		pol = km->compile_policy(sk->family, optname, data, optlen, &err);
 		if (err >= 0)
 			break;
 	}
diff -urN -X dontdiff linux-2.5.63.orig/net/ipv4/xfrm_user.c linux-2.5.63.w1/net/ipv4/xfrm_user.c
--- linux-2.5.63.orig/net/ipv4/xfrm_user.c	Tue Feb 25 15:03:26 2003
+++ linux-2.5.63.w1/net/ipv4/xfrm_user.c	Mon Mar  3 22:56:34 2003
@@ -538,6 +538,21 @@
 		return -EINVAL;
 	};
 
+	switch (p->family) {
+	case AF_INET:
+		break;
+
+	case AF_INET6:
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+		break;
+#else
+		return -EAFNOSUPPORT;
+#endif
+
+	default:
+		return -EINVAL;
+	};
+
 	return verify_policy_dir(p->dir);
 }
 
@@ -1057,7 +1072,8 @@
 /* User gives us xfrm_user_policy_info followed by an array of 0
  * or more templates.
  */
-struct xfrm_policy *xfrm_compile_policy(int opt, u8 *data, int len, int *dir)
+struct xfrm_policy *xfrm_compile_policy(u16 family, int opt,
+                                        u8 *data, int len, int *dir)
 {
 	struct xfrm_userpolicy_info *p = (struct xfrm_userpolicy_info *)data;
 	struct xfrm_user_tmpl *ut = (struct xfrm_user_tmpl *) (p + 1);
diff -urN -X dontdiff linux-2.5.63.orig/net/key/af_key.c linux-2.5.63.w1/net/key/af_key.c
--- linux-2.5.63.orig/net/key/af_key.c	Tue Feb 25 15:03:26 2003
+++ linux-2.5.63.w1/net/key/af_key.c	Mon Mar  3 22:30:56 2003
@@ -2420,7 +2420,8 @@
 	return pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_REGISTERED, NULL);
 }
 
-static struct xfrm_policy *pfkey_compile_policy(int opt, u8 *data, int len, int *dir)
+static struct xfrm_policy *pfkey_compile_policy(u16 family, int opt,
+                                                u8 *data, int len, int *dir)
 {
 	struct xfrm_policy *xp;
 	struct sadb_x_policy *pol = (struct sadb_x_policy*)data;
@@ -2451,6 +2452,7 @@
 	xp->lft.hard_byte_limit = XFRM_INF;
 	xp->lft.soft_packet_limit = XFRM_INF;
 	xp->lft.hard_packet_limit = XFRM_INF;
+	xp->family = family;
 	xp->xfrm_nr = 0;
 
 	if (pol->sadb_x_policy_type == IPSEC_POLICY_IPSEC &&

From davem@redhat.com Mon Mar 3 04:37:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 04:37:38 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23CbYeA007481 for ; Mon, 3 Mar 2003 04:37:35 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id EAA01567; Mon, 3 Mar 2003 04:19:50 -0800 Date: Mon, 03 Mar 2003 04:19:50 -0800 (PST) Message-Id: <20030303.041950.14411363.davem@redhat.com> To: jmorris@intercode.com.au Cc: toml@us.ibm.com, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH] Re: IPSec: setkey -DP freezes machine From: "David S. 
Miller" In-Reply-To: References: <20030303.013451.20307573.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1837 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 530 Lines: 16 From: James Morris Date: Mon, 3 Mar 2003 23:13:55 +1100 (EST) On Mon, 3 Mar 2003, David S. Miller wrote: > Alternatively, a family parameter could be added to the compile_policy() > operation, but this duplicates data already present in our native > xfrm_userpolicy_info format. > > I like this solution, it seems the cleanest. Ok, here's a patch which does this. Looks good, I'll apply this. If more problems are found, we can patch on top of this. From terje.eggestad@scali.com Mon Mar 3 04:51:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 04:51:31 -0800 (PST) Received: from elin.scali.no (elin.scali.no [62.70.89.10]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23CpQeA007923 for ; Mon, 3 Mar 2003 04:51:28 -0800 Received: from pc-16.office.scali.no (pc-16.office.scali.no [172.16.0.116]) by elin.scali.no (8.12.5/8.12.5) with ESMTP id h23CpH4l024526; Mon, 3 Mar 2003 13:51:17 +0100 Subject: Re: anyone ever done multicast AF_UNIX sockets? 
From: Terje Eggestad To: Chris Friesen Cc: linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: <3E5E7081.6020704@nortelnetworks.com> References: <3E5E7081.6020704@nortelnetworks.com> Content-Type: text/plain Organization: Scali AS Message-Id: <1046695876.7731.78.camel@pc-16.office.scali.no> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 Date: 03 Mar 2003 13:51:17 +0100 Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-milter (http://amavis.org/) X-archive-position: 1838 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: terje.eggestad@scali.com Precedence: bulk X-list: netdev Content-Length: 3616 Lines: 87

On a single box you would use a shared memory segment to do this.

It has the following advantages:
- no syscalls at all
- whenever the recipients need to use the info, they access the shm
directly (you may need to use a semaphore to enforce consistency, or if
you're really pressed on time, spin lock a shm location).  There is no need
for the recipients to copy the info to private data structs.
- there is no need for the recipients to waste cycles on processing an
update
- you KNOW that all the recipients have "updated" at the same time.

That aside, your idea of being notified when the listener (peer) is not
there is pretty hopeless when it comes to multicast.  Why does it help
you to know that there are no recipients as opposed to the wrong number
of recipients?  Or, asked differently: if you don't have a notion of who
the recipients are/should be, why would you care if there are none?
There are practically no real applications for this feature.

If you really want to know that a recipient disappeared, use a stream
socket to each recipient, and to keep the number of syscalls down, get
the aio patch and do the send to all with a single lio_listio() call.

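The shared memory approach from the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `shm_board` layout and the `board_*` helpers are hypothetical names, the segment is an anonymous `MAP_SHARED` mapping (so it would be shared with children created by `fork()`), and the "spin lock a shm location" variant is implemented with a C11 `atomic_flag`:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical layout of the shared segment: a spin lock plus the data. */
struct shm_board {
    atomic_flag   lock;     /* spin lock living inside the segment */
    unsigned long version;  /* bumped on every update */
    char          info[64]; /* the data every recipient looks at */
};

/* Map an anonymous shared region; processes forked later inherit it. */
static struct shm_board *board_create(void)
{
    struct shm_board *b = mmap(NULL, sizeof(*b), PROT_READ | PROT_WRITE,
                               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED)
        return NULL;
    atomic_flag_clear(&b->lock);
    b->version = 0;
    memset(b->info, 0, sizeof(b->info));
    return b;
}

static void board_lock(struct shm_board *b)
{
    while (atomic_flag_test_and_set_explicit(&b->lock, memory_order_acquire))
        ; /* spin until the holder releases the flag */
}

static void board_unlock(struct shm_board *b)
{
    atomic_flag_clear_explicit(&b->lock, memory_order_release);
}

/* Sender: update the data in place; no syscall, no per-recipient copy. */
static void board_publish(struct shm_board *b, const char *msg)
{
    board_lock(b);
    snprintf(b->info, sizeof(b->info), "%s", msg);
    b->version++;
    board_unlock(b);
}

/* Recipient: take a consistent snapshot and return its version.
 * Copying out is a convenience here; a recipient could equally well
 * use b->info in place while it holds the lock. */
static unsigned long board_read(struct shm_board *b, char *out, size_t outlen)
{
    unsigned long v;

    assert(outlen > 0);
    board_lock(b);
    snprintf(out, outlen, "%s", b->info);
    v = b->version;
    board_unlock(b);
    return v;
}
```

A spin lock only makes sense when the critical section is a few instructions long; a process that is scheduled out while holding it stalls every other user, which is the concern raised earlier in the thread about the no-server userspace variant.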
Also: keep in mind that whether you do multicast or an explicit send to
all, the data you're sending is copied from your buffer to the
destination sockets' recv buffers anyway.  If you're sending 1k you need
somewhere between 250 and 1000 cycles to do the copy, depending on
alignment.  I've measured the syscall overhead for a write(len=0) to be
about 800 cycles on a P3 or athlon, and about 2000 on P4.

If you really have enough possible recipients, you should use a shm
segment instead.  If you have only a few (~10), the overhead is worst
case 20000 cycles, or on a 2 GHz P4, 10 microsecs to do a syscall for
each.
Who cares...

TJ

On Thu, 2003-02-27 at 21:09, Chris Friesen wrote:
> It is fairly common to want to distribute information between a single
> sender and multiple receivers on a single box.
>
> Multicast IP sockets are one possibility, but then you have additional
> overhead in the IP stack.
>
> Unix sockets are more efficient and give notification if the listener is
> not present, but the problem then becomes that you must do one syscall
> for each listener.
>
> So, here's my main point--has anyone ever considered the concept of
> multicast AF_UNIX sockets?
>
> The main features would be:
> --ability to associate/disassociate a socket with a multicast address
> --ability to associate/disassociate with all multicast addresses
> (possibly through some kind of raw socket thing, or maybe a simple
> wildcard multicast address)
> --on process death all sockets owned by that process are disassociated
> from any multicast addresses that they were associated with
> --on sending a packet to a multicast address and there are no sockets
> associated with it, return -1 with errno=ECONNREFUSED
>
> The association/disassociation could be done using the setsockopt()
> calls the same as with udp sockets, everything else would be the same
> from a userspace perspective.
>
> Any thoughts?  How hard would this be to put in?
> > Chris -- _________________________________________________________________________ Terje Eggestad mailto:terje.eggestad@scali.no Scali Scalable Linux Systems http://www.scali.com Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE) P.O.Box 150, Oppsal +47 975 31 574 (MOBILE) N-0619 Oslo fax: +47 22 62 89 51 NORWAY _________________________________________________________________________ From davem@redhat.com Mon Mar 3 04:53:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 04:53:51 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23CrmeA008291 for ; Mon, 3 Mar 2003 04:53:49 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id EAA01621; Mon, 3 Mar 2003 04:36:00 -0800 Date: Mon, 03 Mar 2003 04:35:59 -0800 (PST) Message-Id: <20030303.043559.19477354.davem@redhat.com> To: terje.eggestad@scali.com Cc: cfriesen@nortelnetworks.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? From: "David S. Miller" In-Reply-To: <1046695876.7731.78.camel@pc-16.office.scali.no> References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1839 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 212 Lines: 6 From: Terje Eggestad Date: 03 Mar 2003 13:51:17 +0100 On a single box you would use a shared memory segment to do this. 
Thank you for applying real brains to this problem :) From toml@us.ibm.com Mon Mar 3 07:39:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 07:39:07 -0800 (PST) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23Fd2eA010972 for ; Mon, 3 Mar 2003 07:39:03 -0800 Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.56.224.150]) by e1.ny.us.ibm.com (8.12.7/8.12.2) with ESMTP id h23Fbeab087742; Mon, 3 Mar 2003 10:37:40 -0500 Received: from d01ml072.pok.ibm.com (d01ml072.pok.ibm.com [9.117.250.211]) by northrelay02.pok.ibm.com (8.12.3/NCO/VER6.5) with ESMTP id h23FbbjS019300; Mon, 3 Mar 2003 10:37:38 -0500 Subject: Re: [PATCH] Re: IPSec: setkey -DP freezes machine To: James Morris Cc: "David S. Miller" , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.11 July 24, 2002 Message-ID: From: "Tom Lendacky" Date: Mon, 3 Mar 2003 09:37:37 -0600 X-MIMETrack: Serialize by Router on D01ML072/01/M/IBM(Release 5.0.11 +SPRs MIAS5EXFG4, MIAS5AUFPV and DHAG4Y6R7W, MATTEST |November 8th, 2002) at 03/03/2003 10:37:39 AM MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii X-archive-position: 1840 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 384 Lines: 17 > Ok, here's a patch which does this. > > I've also added check to verify_newpolicy_info() so that we don't run into > the same problem for policies provided via the netlink interface. > > Tom, would you let me know if this works for you, as my racoon isn't > working yet. The patch works for me, setkey -DP no longer freezes the machine and the proper output is displayed. 
Tom From davem@redhat.com Mon Mar 3 07:42:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 07:42:10 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23Fg6eA011343 for ; Mon, 3 Mar 2003 07:42:07 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id HAA02317; Mon, 3 Mar 2003 07:23:46 -0800 Date: Mon, 03 Mar 2003 07:23:45 -0800 (PST) Message-Id: <20030303.072345.99136696.davem@redhat.com> To: toml@us.ibm.com Cc: jmorris@intercode.com.au, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] Re: IPSec: setkey -DP freezes machine From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1841 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 310 Lines: 10 From: "Tom Lendacky" Date: Mon, 3 Mar 2003 09:37:37 -0600 > Tom, would you let me know if this works for you, as my racoon isn't > working yet. The patch works for me, setkey -DP no longer freezes the machine and the proper output is displayed. Thank you for testing. 
From cfriesen@nortelnetworks.com Mon Mar 3 09:09:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 09:09:57 -0800 (PST) Received: from zcars04e.nortelnetworks.com (zcars04e.nortelnetworks.com [47.129.242.56]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23H9keA012527 for ; Mon, 3 Mar 2003 09:09:48 -0800 Received: from zcard307.ca.nortel.com (zcard307.ca.nortel.com [47.129.242.67]) by zcars04e.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h23H9cR08332; Mon, 3 Mar 2003 12:09:39 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDFA8BZR; Mon, 3 Mar 2003 12:09:39 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YYNG; Mon, 3 Mar 2003 12:09:38 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 1DDAA2E12F; Mon, 3 Mar 2003 12:09:38 -0500 (EST) Message-ID: <3E638C51.2000904@nortelnetworks.com> Date: Mon, 03 Mar 2003 12:09:37 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: Terje Eggestad Cc: linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org, davem@redhat.com Subject: Re: anyone ever done multicast AF_UNIX sockets? 
References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1842 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 2790 Lines: 58 Terje Eggestad wrote: > On a single box you would use a shared memory segment to do this. It has > the following advantages: > - no syscalls at all Unless you poll for messages on the receiving side, how do you trigger the receiver to look for a message? Shared memory doesn't have file descriptors. > - whenever the recipients need to use the info, they access the shm > directly (you may need to use a semaphore to enforce consistency, or if > you're really pressed on time, spin lock a shm location) There is no > need for the recipients to copy the info to private data structs. How do they know the information has changed? Suppose one process detects that the ethernet link has dropped. How does it alert other processes which need to do something? > Why does it help you to know that there are no recipients contra the > wrong number recipients ???? OR asked differently, if you don't have a > notion of who the recipients are/should be, why would you care if there > are none?????? > There are practically no real applications for this feature. It's true that if I have a nonzero number of listeners it doesn't tell me anything since I don't know if the right one is included. However, if I send a message and there were *no* listeners but I know that there should be at least one, then I can log the anomaly, raise an alarm, or take whatever action is appropriate. > Also: Keep in mind that either you do multicast, or explisit send to > all, the data you're sending are copied from you buffer to the dest > sockets recv buffers anyway. 
If you're sending 1k you need somewhere > between 250 to 1000 cycles to do the copy, depending on alignment. I've > measured the syscall overhead for a write(len=0) to be about 800 cycles > on a P3 or athlon, and about 2000 on P4. If you really have enough > possible recipients, you should use a shm segment instead. If you have > only a few (~10) the overhead is worst case 20000 cycles, or on a 2G P4, > 10 microsecs to do a syscall for each. Who cares... Granted, shared memory (or sysV message queues) are the fastest way to transfer data between processes. However, you still have to implement some way to alert the receiver that there is a message waiting for it. For large packet sizes it may be sufficient to send a small unix socket message to alert it that there is a message waiting, but for small messages the cost of the copying is small compared to the cost of the context switch, and the unix multicast cuts the number of context switches in half. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From davem@redhat.com Mon Mar 3 09:13:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 09:13:08 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23HD3eA012917 for ; Mon, 3 Mar 2003 09:13:03 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id IAA02602; Mon, 3 Mar 2003 08:55:04 -0800 Date: Mon, 03 Mar 2003 08:55:04 -0800 (PST) Message-Id: <20030303.085504.105424448.davem@redhat.com> To: cfriesen@nortelnetworks.com Cc: terje.eggestad@scali.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? From: "David S. 
Miller" In-Reply-To: <3E638C51.2000904@nortelnetworks.com> References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> <3E638C51.2000904@nortelnetworks.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1843 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 254 Lines: 8 From: Chris Friesen Date: Mon, 03 Mar 2003 12:09:37 -0500 Unless you poll for messages on the receiving side, how do you trigger the receiver to look for a message? Send signals. Use a FUTEX, be creative... From cfriesen@nortelnetworks.com Mon Mar 3 10:03:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 10:03:32 -0800 (PST) Received: from zcars04e.nortelnetworks.com (zcars04e.nortelnetworks.com [47.129.242.56]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23I3PeA015592 for ; Mon, 3 Mar 2003 10:03:25 -0800 Received: from zcard307.ca.nortel.com (zcard307.ca.nortel.com [47.129.242.67]) by zcars04e.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h23I2jR09110; Mon, 3 Mar 2003 13:02:45 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDFA8HJT; Mon, 3 Mar 2003 13:02:45 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YYTZ; Mon, 3 Mar 2003 13:02:45 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 978492E12F; Mon, 3 Mar 2003 13:02:44 -0500 (EST) Message-ID: <3E6398C4.2020605@nortelnetworks.com> Date: Mon, 03 Mar 2003 13:02:44 
-0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: jamal Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? References: <3E5E7081.6020704@nortelnetworks.com> <20030228083009.Y53276@shell.cyberus.ca> <3E5F748E.2080605@nortelnetworks.com> <20030228212309.C57212@shell.cyberus.ca> <3E619E97.8010508@nortelnetworks.com> <20030302081916.S61365@shell.cyberus.ca> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1844 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 4146 Lines: 103 jamal wrote: > On Sun, 2 Mar 2003, Chris Friesen wrote >>jamal wrote >>>Did you also measure throughput >>No. lmbench doesn't appear to test UDP socket local throughput > I think you need to collect all data if you are trying to show > improvements. I'll look at how they were measuring unix socket throughput and try implementing something similar for UDP. It's not clear to me how to really measure throughput in a multicast environment though since it depends very much on your application messaging patterns. > Ok, so its only a problem when you have a few listeners i.e user space > scheme scales just fine as you keep adding listeners. > In your tests what was the break-even point? See below for more detailed test results. > Addressing has to be backwared compatible i.e not affecting any other > program. Of course. The way I've designed it is that you get and bind() a socket as normal, and then use setsockopt() to register interest in a multicast address (same as IP multicast). If the address already exists but is not a multicast address, then you get an error. 
If a socket tries to bind() or connect() to an existing multicast address, you get an error. The different types of addresses exist in the same address space, but the only way to register interest in multicast addresses is through setsockopt().

>>The timings (in usec) for the delays to each of the listeners were as
>>follows on my duron 750:
>>
>>userspace server: 104 133 153
>>userspace no server: 72 111 138
>>kernelspace: 60 91 113

> Actually, the difference between user space server and kernel doesnt
> appear that big. What you need to do is collect more data.
> repeat with incrementing number of listeners.

What would you consider a "big" difference? Here the userspace server is 35% slower than the kernelspace version.

You wanted more data, so here's results comparing the no-server userspace method vs the kernel method. The server-based one would be slightly more expensive than the no-server version. The results below are the smallest and largest latencies (in usecs) for the message to reach the listeners in userspace. I've used three different sizes, the two extremes and a roughly average sized message in my particular domain.

44-byte message
# listeners     userspace       kernelspace
10              73,335          103,252
20              72,610          106,429
50              74,1482         205,1301
100             76,3000         362,3425
200             737,9917

236-byte message
# listeners     userspace       kernelspace
10              70,346          81,265
20              74,639          122,468
50              75,1557         230,1421
100             80,3107         408,3743

40036-byte message
# listeners     userspace       kernelspace
10              302,4181        322,1692
20              303,7491        347,3450
50              306,10451       483,8394
100             309,23107       697,17061
200             313,45528       997,39810

As one would expect, the initial latencies are somewhat higher for the kernel space solution since all the skb header duplication is done before anyone is woken up.

One thing that I did not expect was the increased max latency in the kernel space solution when the number of listeners grew large.
On reflection, however, I suspect that this is due to scheduler load since all of the listening processes have become runnable while in the userspace version they become runnable one at a time. It would be interesting to run this on 2.5 with the O(1) scheduler and see if it makes a difference. With larger message sizes, the cost of the additional copies in the userspace solution start to outweigh the overhead of the additional runnable processes and the kernel space solution stays faster in all runs tested. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From cfriesen@nortelnetworks.com Mon Mar 3 10:07:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 10:08:02 -0800 (PST) Received: from zcars04e.nortelnetworks.com (zcars04e.nortelnetworks.com [47.129.242.56]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23I7ueA016947 for ; Mon, 3 Mar 2003 10:07:56 -0800 Received: from zcard307.ca.nortel.com (zcard307.ca.nortel.com [47.129.242.67]) by zcars04e.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h23I7mR09130; Mon, 3 Mar 2003 13:07:48 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDFA8H5L; Mon, 3 Mar 2003 13:07:47 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YY44; Mon, 3 Mar 2003 13:07:48 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 74D0A2E12F; Mon, 3 Mar 2003 13:07:45 -0500 (EST) Message-ID: <3E6399F1.10303@nortelnetworks.com> Date: Mon, 03 Mar 2003 13:07:45 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; 
en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: "David S. Miller" Cc: terje.eggestad@scali.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> <3E638C51.2000904@nortelnetworks.com> <20030303.085504.105424448.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1845 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 793 Lines: 26 David S. Miller wrote: > From: Chris Friesen > Date: Mon, 03 Mar 2003 12:09:37 -0500 > > Unless you poll for messages on the receiving side, how do you trigger > the receiver to look for a message? > > Send signals. Use a FUTEX, be creative... Suppose I have a process that waits on UDP packets, the unified local IPC that we're discussing, other unix sockets, and stdin. It's awfully nice if the local IPC can be handled using the same select/poll mechanism as all the other messaging. 
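The point about handling every message source through one select/poll call can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name drain_ready, the socket path local-ipc.sock, and the "link down" payload are made up for the example, stdin is left out to keep it self-contained, and an ordinary AF_UNIX datagram socket stands in for the proposed multicast one, which does not exist in mainline.

```python
import os
import select
import socket
import tempfile
import time

def drain_ready(sources, timeout):
    """Wait in a single select() call; return {name: datagram} for readable sockets."""
    by_fd = {sock.fileno(): name for name, sock in sources.items()}
    readable, _, _ = select.select(list(by_fd), [], [], timeout)
    return {by_fd[fd]: sources[by_fd[fd]].recv(4096) for fd in readable}

# A UDP socket on an ephemeral loopback port.
udp_rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_rx.bind(("127.0.0.1", 0))

# A unix datagram socket bound to a temporary path (hypothetical name).
tmpdir = tempfile.mkdtemp()
unix_path = os.path.join(tmpdir, "local-ipc.sock")
unix_rx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
unix_rx.bind(unix_path)

sources = {"udp": udp_rx, "unix": unix_rx}

# Send one message to each receiver.
udp_tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_tx.sendto(b"link down", udp_rx.getsockname())
unix_tx = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
unix_tx.sendto(b"link down", unix_path)

# The same loop services both transports until each has delivered its message.
messages = {}
deadline = time.monotonic() + 2.0
while len(messages) < len(sources) and time.monotonic() < deadline:
    messages.update(drain_ready(sources, timeout=0.2))

for s in (udp_tx, unix_tx, udp_rx, unix_rx):
    s.close()
os.unlink(unix_path)
os.rmdir(tmpdir)
```

A signal or futex based wakeup would need a separate code path next to this loop, which is the inconvenience being described.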
Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From davem@redhat.com Mon Mar 3 10:14:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 10:14:38 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23IEWeA017421 for ; Mon, 3 Mar 2003 10:14:33 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id JAA03149; Mon, 3 Mar 2003 09:56:41 -0800 Date: Mon, 03 Mar 2003 09:56:41 -0800 (PST) Message-Id: <20030303.095641.87696857.davem@redhat.com> To: cfriesen@nortelnetworks.com Cc: terje.eggestad@scali.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? From: "David S. Miller" In-Reply-To: <3E6399F1.10303@nortelnetworks.com> References: <3E638C51.2000904@nortelnetworks.com> <20030303.085504.105424448.davem@redhat.com> <3E6399F1.10303@nortelnetworks.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1846 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 860 Lines: 18 From: Chris Friesen Date: Mon, 03 Mar 2003 13:07:45 -0500 Suppose I have a process that waits on UDP packets, the unified local IPC that we're discussing, other unix sockets, and stdin. It's awfully nice if the local IPC can be handled using the same select/poll mechanism as all the other messaging. So use UDP, you still haven't backed up your performance claims. 
Experiment, set the SO_NO_CHECK socket option to "1" and see if that makes a difference performance wise for local clients. But if performance is "so important", then you shouldn't really be shying away from the shared memory suggestion and nothing is going to top that (it eliminates all the copies, using flat out AF_UNIX over UDP only truly eliminates some header processing, nothing more, the copies are still there with AF_UNIX). From ak@suse.de Mon Mar 3 10:18:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 10:18:46 -0800 (PST) Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23IIdeA017787 for ; Mon, 3 Mar 2003 10:18:40 -0800 Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id CA51114EAB; Mon, 3 Mar 2003 19:18:07 +0100 (MET) To: Chris Friesen Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org, hadi@cyberus.ca Subject: Re: anyone ever done multicast AF_UNIX sockets? 
References: <3E5E7081.6020704@nortelnetworks.com.suse.lists.linux.kernel>
 <20030228083009.Y53276@shell.cyberus.ca.suse.lists.linux.kernel>
 <3E5F748E.2080605@nortelnetworks.com.suse.lists.linux.kernel>
 <20030228212309.C57212@shell.cyberus.ca.suse.lists.linux.kernel>
 <3E619E97.8010508@nortelnetworks.com.suse.lists.linux.kernel>
 <20030302081916.S61365@shell.cyberus.ca.suse.lists.linux.kernel>
 <3E6398C4.2020605@nortelnetworks.com.suse.lists.linux.kernel>
From: Andi Kleen
Date: 03 Mar 2003 19:18:07 +0100
In-Reply-To: Chris Friesen's message of "3 Mar 2003 19:07:27 +0100"
Message-ID:
X-Mailer: Gnus v5.7/Emacs 20.7
X-archive-position: 1847
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ak@suse.de
Precedence: bulk
X-list: netdev
Content-Length: 633
Lines: 17

Chris Friesen writes:

> I'll look at how they were measuring unix socket throughput and try
> implementing something similar for UDP. It's not clear to me how to
> really measure throughput in a multicast environment though since it
> depends very much on your application messaging patterns.

Unix sockets are often slower than TCP over loopback because they use much smaller socket buffer sizes by default. This causes many more context switches. Just run a vmstat 1 in parallel and watch the context switch rates. You can fix it by increasing the send and receive buffers of the unix socket.
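A rough sketch of that last suggestion, assuming a Linux-like system. The helper name grow_buffers and the 256 KiB request are illustrative choices, not values from this thread. The kernel is free to adjust the request (Linux, for instance, doubles the value and caps it at a system-wide limit), so the sketch reads the sizes back instead of assuming them.

```python
import socket

def grow_buffers(sock, size):
    """Request larger send/receive buffers and return what the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return (sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
            sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

# A connected pair of unix datagram sockets to experiment on.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

default_snd = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
default_rcv = a.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

new_snd, new_rcv = grow_buffers(a, 256 * 1024)  # hypothetical target size

print("send buffer:", default_snd, "->", new_snd)
print("recv buffer:", default_rcv, "->", new_rcv)

a.close()
b.close()
```

Comparing context switch rates in vmstat 1 before and after such a change would show whether buffer size is the limiting factor on a given box.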
-Andi From cfriesen@nortelnetworks.com Mon Mar 3 11:11:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 11:11:24 -0800 (PST) Received: from zcars04e.nortelnetworks.com (zcars04e.nortelnetworks.com [47.129.242.56]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23JBGeA018646 for ; Mon, 3 Mar 2003 11:11:17 -0800 Received: from zcard307.ca.nortel.com (zcard307.ca.nortel.com [47.129.242.67]) by zcars04e.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h23JB8R10116; Mon, 3 Mar 2003 14:11:08 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDFA8P5D; Mon, 3 Mar 2003 14:11:08 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YY8H; Mon, 3 Mar 2003 14:11:08 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id DA6CE2E12F; Mon, 3 Mar 2003 14:11:07 -0500 (EST) Message-ID: <3E63A8CB.2090307@nortelnetworks.com> Date: Mon, 03 Mar 2003 14:11:07 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: "David S. Miller" Cc: terje.eggestad@scali.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? 
References: <3E638C51.2000904@nortelnetworks.com> <20030303.085504.105424448.davem@redhat.com> <3E6399F1.10303@nortelnetworks.com> <20030303.095641.87696857.davem@redhat.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1848 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 1914 Lines: 47 David S. Miller wrote: > From: Chris Friesen > Date: Mon, 03 Mar 2003 13:07:45 -0500 > > Suppose I have a process that waits on UDP packets, the unified local > IPC that we're discussing, other unix sockets, and stdin. It's awfully > nice if the local IPC can be handled using the same select/poll > mechanism as all the other messaging. > > So use UDP, you still haven't backed up your performance > claims. Experiment, set the SO_NO_CHECK socket option to > "1" and see if that makes a difference performance wise > for local clients. I did provide numbers for UDP latency, which is more critical for my own application since most messages fit within a single packet. I haven't done UDP bandwidth testing--I need to check how lmbench did it for the unix socket and do the same for UDP. Local TCP was far slower than unix sockets though. > But if performance is "so important", then you shouldn't really be > shying away from the shared memory suggestion and nothing is going to > top that (it eliminates all the copies, using flat out AF_UNIX over > UDP only truly eliminates some header processing, nothing more, the > copies are still there with AF_UNIX). Yes, I realize that the receiver still has to do a copy. With large messages this could be an issue. With small messages, I had assumed that the cost of a recv() wouldn't be that much worse than the cost of the sender doing a kill() to alert the receiver that a message is waiting. Maybe I was wrong. 
It might be interesting to try a combination of sysV msg queue and signals to see how it stacks up. Project for tonight. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From davem@redhat.com Mon Mar 3 11:14:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 11:14:42 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23JEceA019020 for ; Mon, 3 Mar 2003 11:14:38 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id KAA03434; Mon, 3 Mar 2003 10:56:46 -0800 Date: Mon, 03 Mar 2003 10:56:46 -0800 (PST) Message-Id: <20030303.105646.02089773.davem@redhat.com> To: cfriesen@nortelnetworks.com Cc: terje.eggestad@scali.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? From: "David S. Miller" In-Reply-To: <3E63A8CB.2090307@nortelnetworks.com> References: <3E6399F1.10303@nortelnetworks.com> <20030303.095641.87696857.davem@redhat.com> <3E63A8CB.2090307@nortelnetworks.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1849 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 486 Lines: 12 From: Chris Friesen Date: Mon, 03 Mar 2003 14:11:07 -0500 I haven't done UDP bandwidth testing--I need to check how lmbench did it for the unix socket and do the same for UDP. Local TCP was far slower than unix sockets though. 
That result is system specific and depends upon how the data and datastructures hit the cpu cachelines in the kernel. TCP bandwidth is slightly faster than AF_UNIX bandwidth on my sparc64 boxes for example.

From terje.eggestad@scali.com Mon Mar 3 11:35:48 2003
Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 11:35:55 -0800 (PST)
Received: from localhost.localdomain (2etnv5.cm.chello.no [80.111.51.24]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23JZkeA020080 for ; Mon, 3 Mar 2003 11:35:47 -0800
Received: from localhost (localhost [127.0.0.1]) by localhost.localdomain (8.12.5/8.12.5) with ESMTP id h23JdK2c028159; Mon, 3 Mar 2003 20:39:20 +0100
Subject: Re: anyone ever done multicast AF_UNIX sockets?
From: Terje Eggestad
To: Chris Friesen
Cc: linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org, davem@redhat.com
In-Reply-To: <3E638C51.2000904@nortelnetworks.com>
References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> <3E638C51.2000904@nortelnetworks.com>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10)
Date: 03 Mar 2003 20:39:19 +0100
Message-Id: <1046720360.28127.209.camel@eggis1>
Mime-Version: 1.0
X-archive-position: 1850
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: terje.eggestad@scali.com
Precedence: bulk
X-list: netdev
Content-Length: 4779
Lines: 99

On Mon, 2003-03-03 at 18:09, Chris Friesen wrote:
Terje Eggestad wrote:
> On a single box you would use a shared memory segment to do this. It has
> the following advantages:
> - no syscalls at all

Unless you poll for messages on the receiving side, how do you trigger the receiver to look for a message? Shared memory doesn't have file descriptors.

OK, you want multicast to send the *same* info to all peers. One of only two sane reasons to do that is to update the peers with some info they need to do real work.
So when there is real work to be done, the info is available in the shm. The other reason is to tell the others to die. Then you either a) have a socket/pipe connected that you get an end-of-file event on, or b) have a timeout on the select() (in any real-life app you should anyway), so that when select/poll times out (returns 0) or returns -1 with errno=EINTR, you check some flags in shm.

If you *had* multicast, you wouldn't know *when* a peer processed it. What if the peer is suspended? You don't get an error on the send, and you apparently never get an answer; then what? The peer may also have gone haywire in a while(1);

I have an OSS project (http://midway.sourceforge.net/) where I have a gateway daemon that polls on a large set of sockets (TCP/IP clients) and passes the requests to IPC servers, and back. The way I'm doing that is to have two threads, one in a blocking wait on select/poll, the other on msgrcv. Works quite well.

> - whenever the recipients need to use the info, they access the shm
> directly (you may need to use a semaphore to enforce consistency, or if
> you're really pressed on time, spin lock a shm location) There is no
> need for the recipients to copy the info to private data structs.

How do they know the information has changed? Suppose one process detects that the ethernet link has dropped. How does it alert other processes which need to do something?

Again, if you want someone to do something, they must ack the request before you can safely assume that they are going to do something.

> Why does it help you to know that there are no recipients contra the
> wrong number recipients ???? OR asked differently, if you don't have a
> notion of who the recipients are/should be, why would you care if there
> are none??????
> There are practically no real applications for this feature.

It's true that if I have a nonzero number of listeners it doesn't tell me anything since I don't know if the right one is included.
However, if I send a message and there were *no* listeners but I know that there should be at least one, then I can log the anomaly, raise an alarm, or take whatever action is appropriate. > Also: Keep in mind that either you do multicast, or explisit send to > all, the data you're sending are copied from you buffer to the dest > sockets recv buffers anyway. If you're sending 1k you need somewhere > between 250 to 1000 cycles to do the copy, depending on alignment. I've > measured the syscall overhead for a write(len=0) to be about 800 cycles > on a P3 or athlon, and about 2000 on P4. If you really have enough > possible recipients, you should use a shm segment instead. If you have > only a few (~10) the overhead is worst case 20000 cycles, or on a 2G P4, > 10 microsecs to do a syscall for each. Who cares... Granted, shared memory (or sysV message queues) are the fastest way to transfer data between processes. However, you still have to implement some way to alert the receiver that there is a message waiting for it. For large packet sizes it may be sufficient to send a small unix socket message to alert it that there is a message waiting, but for small messages the cost of the copying is small compared to the cost of the context switch, and the unix multicast cuts the number of context switches in half. 
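The large-message variant mentioned above (payload in shared memory, a small unix socket message as the wakeup) might look roughly like this. It is a sketch under assumptions: a file-backed mmap stands in for a real shm segment, both sides run in one process for brevity, and the names seg_path, notify_tx and notify_rx are invented for the example. The 40036-byte size is borrowed from the largest message in the latency tests earlier in the thread.

```python
import mmap
import os
import socket
import struct
import tempfile

payload = b"x" * 40036  # large message that we do not want to copy through a socket

# Writer side: put the payload into a shared, file-backed mapping.
fd, seg_path = tempfile.mkstemp()
os.ftruncate(fd, len(payload))
writer_map = mmap.mmap(fd, len(payload))  # MAP_SHARED is the default on Unix
writer_map[:len(payload)] = payload

# The notification is a 4-byte datagram carrying only the payload length.
notify_tx, notify_rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
notify_tx.send(struct.pack("!I", len(payload)))

# Reader side: the socket wakeup says how much to read from the mapping.
note = notify_rx.recv(16)
(length,) = struct.unpack("!I", note)
reader_fd = os.open(seg_path, os.O_RDONLY)
reader_map = mmap.mmap(reader_fd, length, access=mmap.ACCESS_READ)
received = reader_map[:length]

reader_map.close()
writer_map.close()
os.close(reader_fd)
os.close(fd)
os.unlink(seg_path)
notify_tx.close()
notify_rx.close()
```

The notification still costs one send per listener, so this only addresses the copy cost for large messages, not the syscall and context switch counts discussed for small ones.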
Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com -- _________________________________________________________________________ Terje Eggestad mailto:terje.eggestad@scali.no Scali Scalable Linux Systems http://www.scali.com Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE) P.O.Box 150, Oppsal +47 975 31 574 (MOBILE) N-0619 Oslo fax: +47 22 62 89 51 NORWAY _________________________________________________________________________ From terje.eggestad@scali.com Mon Mar 3 11:38:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 11:38:45 -0800 (PST) Received: from localhost.localdomain (2etnv5.cm.chello.no [80.111.51.24]) by oss.sgi.com (8.12.5/8.12.5) with SMTP id h23JcaeA020456 for ; Mon, 3 Mar 2003 11:38:37 -0800 Received: from localhost (localhost [127.0.0.1]) by localhost.localdomain (8.12.5/8.12.5) with ESMTP id h23JgC2c028173; Mon, 3 Mar 2003 20:42:12 +0100 Subject: Re: anyone ever done multicast AF_UNIX sockets? From: Terje Eggestad To: "David S. Miller" Cc: cfriesen@nortelnetworks.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: <20030303.105646.02089773.davem@redhat.com> References: <3E6399F1.10303@nortelnetworks.com> <20030303.095641.87696857.davem@redhat.com> <3E63A8CB.2090307@nortelnetworks.com> <20030303.105646.02089773.davem@redhat.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 03 Mar 2003 20:42:12 +0100 Message-Id: <1046720532.28127.213.camel@eggis1> Mime-Version: 1.0 X-archive-position: 1851 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: terje.eggestad@scali.com Precedence: bulk X-list: netdev Content-Length: 1295 Lines: 32 On Mon, 2003-03-03 at 19:56, David S. 
Miller wrote: From: Chris Friesen Date: Mon, 03 Mar 2003 14:11:07 -0500 I haven't done UDP bandwidth testing--I need to check how lmbench did it for the unix socket and do the same for UDP. Local TCP was far slower than unix sockets though. That result is system specific and depends upon how the data and datastructures hit the cpu cachelines in the kernel. TCP bandwidth is slightly faster than AF_UNIX bandwidth on my sparc64 boxes for example. I've seen that their are the same on linux.I tried to to do AF_UNIX instead of AF_INET internally to boost perf, but to no avail. Makes you suspect that the loopback device actually create an AF_UNIX connection under the hood ;-) -- _________________________________________________________________________ Terje Eggestad mailto:terje.eggestad@scali.no Scali Scalable Linux Systems http://www.scali.com Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE) P.O.Box 150, Oppsal +47 975 31 574 (MOBILE) N-0619 Oslo fax: +47 22 62 89 51 NORWAY _________________________________________________________________________ From cfriesen@nortelnetworks.com Mon Mar 3 17:31:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 17:31:30 -0800 (PST) Received: from zcars04f.nortelnetworks.com (zcars04f.nortelnetworks.com [47.129.242.57]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h241VNp28668 for ; Mon, 3 Mar 2003 17:31:23 -0800 Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars04f.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h23LWvU00760; Mon, 3 Mar 2003 16:32:57 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDF4SKVC; Mon, 3 Mar 2003 16:32:57 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YZ3Z; Mon, 3 Mar 2003 16:32:57 -0500 Received: from 
nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 73CD52E12F; Mon, 3 Mar 2003 16:32:56 -0500 (EST) Message-ID: <3E63CA08.4040209@nortelnetworks.com> Date: Mon, 03 Mar 2003 16:32:56 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: Terje Eggestad Cc: "David S. Miller" , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: anyone ever done multicast AF_UNIX sockets? References: <3E6399F1.10303@nortelnetworks.com> <20030303.095641.87696857.davem@redhat.com> <3E63A8CB.2090307@nortelnetworks.com> <20030303.105646.02089773.davem@redhat.com> <1046720532.28127.213.camel@eggis1> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1852 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 1774 Lines: 43

Terje Eggestad wrote:
> On Mon, 2003-03-03 at 19:56, David S. Miller wrote:
> TCP bandwidth is slightly faster than AF_UNIX bandwidth on my
> sparc64 boxes for example.
>
> I've seen that their are the same on linux.I tried to to do AF_UNIX
> instead of AF_INET internally to boost perf, but to no avail. Makes you
> suspect that the loopback device actually create an AF_UNIX connection
> under the hood ;-)

On my P4 1.8GHz, AF_INET vs AF_UNIX looks like this:

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------
Host      OS      2p/0K Pipe  AF    UDP   RPC/  TCP   RPC/  TCP
                  ctxsw       UNIX        UDP         TCP   conn
--------- ------- ----- ----- ----- ----- ----- ----- ----- -----
pcard0ks. 2.4.18- 1.740  10.4  15.9  20.1  33.1  23.5  44.3  72.7
pcard0ks. 2.4.18- 1.560  10.6  16.0  23.4  38.1  36.1  44.6  77.4

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host      OS      Pipe  AF    TCP   File   Mmap   Bcopy  Bcopy  Mem   Mem
                        UNIX        reread reread (libc) (hand) read  write
--------- ------- ----- ----- ----- ------ ------ ------ ------ ----- -----
pcard0ks. 2.4.18-  650.  677.  151.  721.9  958.0  290.8  288.8  955. 418.4
pcard0ks. 2.4.18-  379.  701.  163.  714.8  949.5  289.5  288.5  956. 420.5

On this machine at least, UDP latency is 25% worse than AF_UNIX, and TCP bandwidth is about 22% that of AF_UNIX.

Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com
From cfriesen@nortelnetworks.com Mon Mar 3 17:35:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 17:35:11 -0800 (PST) Received: from zcars04f.nortelnetworks.com (zcars04f.nortelnetworks.com [47.129.242.57]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h241Z7j29263 for ; Mon, 3 Mar 2003 17:35:07 -0800 Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars04f.nortelnetworks.com (Switch-2.2.0/Switch-2.2.0) with ESMTP id h23MTEU04066; Mon, 3 Mar 2003 17:29:15 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDF4SL41; Mon, 3 Mar 2003 17:29:15 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7YZVN; Mon, 3 Mar 2003 17:29:15 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 3CEB72E12F; Mon, 3 Mar 2003 17:29:14 -0500 (EST) Message-ID: <3E63D73A.2000402@nortelnetworks.com> Date: Mon, 03 Mar 2003 17:29:14 -0500 X-Sybari-Space: 00000000 00000000
00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: Terje Eggestad Cc: linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org, davem@redhat.com Subject: Re: anyone ever done multicast AF_UNIX sockets? References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> <3E638C51.2000904@nortelnetworks.com> <1046720360.28127.209.camel@eggis1> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1853 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 2414 Lines: 54 Terje Eggestad wrote: > On Mon, 2003-03-03 at 18:09, Chris Friesen wrote: > Terje Eggestad wrote: > > On a single box you would use a shared memory segment to do this. It has > > the following advantages: > > - no syscalls at all > > Unless you poll for messages on the receiving side, how do you trigger > the receiver to look for a message? Shared memory doesn't have file > descriptors. > > OK, you want multicast to send the *same* info to all peers. The only of > two sane reason to do that is to update the peers with some info they > need to do real work. So when there is reel work to be done, the info is > available in the shm. Okay, but how do they know there is work to be done? They're waiting in select() monitoring sockets, fds, being hit with signals, etc. How do you tell them to check their messages? You have to hit them over the head with a signal or something and tell them to check the shared memory messages. > If you *had* multicast, you don't know *when* a peer proccessed it. > What if the peer is suspended ??? you don't get an error on the send, > and you apparently never get an answer, then what? The peer may also > gone haywire on a while(1); Exactly. 
So if the message got delivered you have no way of knowing for sure that it was processed and you have application-level timers and stuff. But if the message wasn't delivered to anyone and you know it should have been, then you don't have to wait for the timer to expire to know that they didn't get it. > How do they know the information has changed? Suppose one process > detects that the ethernet link has dropped. How does it alert other > processes which need to do something? > > Again, if you want someone to do something, they must ack the request > before you can safely assume that they are going to do something. Certainly. My point was that if you're trying to handle all events in a single thread, you need some way to tell the message recipient that it needs to check the shared memory buffer. Otherwise you need multiple threads like you mentioned in your project description. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From terje.eggestad@scali.com Mon Mar 3 18:14:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 18:14:27 -0800 (PST) Received: from localhost.localdomain (2etnv5.cm.chello.no [80.111.51.24]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h242ELf31746 for ; Mon, 3 Mar 2003 18:14:21 -0800 Received: from localhost (localhost [127.0.0.1]) by localhost.localdomain (8.12.5/8.12.5) with ESMTP id h23NTO2c028472; Tue, 4 Mar 2003 00:29:24 +0100 Subject: Re: anyone ever done multicast AF_UNIX sockets? 
From: Terje Eggestad To: Chris Friesen Cc: linux-kernel , netdev@oss.sgi.com, linux-net@vger.kernel.org, davem@redhat.com In-Reply-To: <3E63D73A.2000402@nortelnetworks.com> References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office.scali.no> <3E638C51.2000904@nortelnetworks.com> <1046720360.28127.209.camel@eggis1> <3E63D73A.2000402@nortelnetworks.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 04 Mar 2003 00:29:24 +0100 Message-Id: <1046734165.27924.263.camel@eggis1> Mime-Version: 1.0 X-archive-position: 1854 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: terje.eggestad@scali.com Precedence: bulk X-list: netdev Content-Length: 2751 Lines: 82

My point is that you can't send a request for real work with either shm or multicast. You don't know who or how many recipients there are. You just use it to update someone that does real work. Then they tend not to need it until they get a request for real work, which almost always arrives on a tcp connection or as a UDP (unicast) message.

How do you design a protocol that uses multicast to send a request to do work?

All the uses of multicast/broadcast I can think of right now are:
* Discovery, like in NIS.
* Announcements, like in OSPF.
* Updates, like in NTP broadcast.

DHCP is actually a nice example of the very, very bad things that happen if you lose control of how many servers are running.

On Mon, 2003-03-03 at 23:29, Chris Friesen wrote:
Terje Eggestad wrote:
> On Mon, 2003-03-03 at 18:09, Chris Friesen wrote:
> If you *had* multicast, you don't know *when* a peer proccessed it.
> What if the peer is suspended ??? you don't get an error on the send,
> and you apparently never get an answer, then what? The peer may also
> gone haywire on a while(1);

Exactly.
So if the message got delivered you have no way of knowing for sure that it was processed and you have application-level timers and stuff. But if the message wasn't delivered to anyone and you know it should have been, then you don't have to wait for the timer to expire to know that they didn't get it.

Nice to know, but how does it help you? What if there is a subscriber out there that is hung? You need that timer *anyway*. Why the special case?

All I see you're trying to do is something like this (just the nonblocking version):

do_unix_mcast(message)
{
    alarm(timeout);

    rc = write(fd_unixmulticast, message, mlen);
    if (rc == -1 && errno == nosubscribers)   /* the proposed "no listeners" error */
        goto they_are_all_dead;

    rc = select(fd_unixmulticast + 1, ...);   /* wait for a reply */
    if (rc == -1 && errno == EINTR)           /* alarm fired before any reply */
        goto they_are_all_dead;

    alarm(0);
    process_reply();
    return;

they_are_all_dead:
    handle_all_dead_peers();
    return;
}

Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com -- _________________________________________________________________________ Terje Eggestad mailto:terje.eggestad@scali.no Scali Scalable Linux Systems http://www.scali.com Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE) P.O.Box 150, Oppsal +47 975 31 574 (MOBILE) N-0619 Oslo fax: +47 22 62 89 51 NORWAY _________________________________________________________________________
From terje.eggestad@scali.com Mon Mar 3 18:14:24 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 18:14:27 -0800 (PST) Received: from localhost.localdomain (2etnv5.cm.chello.no [80.111.51.24]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h242ENf31754 for ; Mon, 3 Mar 2003 18:14:23 -0800 Received: from localhost (localhost [127.0.0.1]) by localhost.localdomain (8.12.5/8.12.5) with ESMTP id h23Nc22c028479; Tue, 4 Mar 2003 00:38:03 +0100 Subject: Re: anyone ever done multicast AF_UNIX sockets? From: Terje Eggestad To: Chris Friesen Cc: "David S.
Miller" , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, linux-net@vger.kernel.org In-Reply-To: <3E63CA08.4040209@nortelnetworks.com> References: <3E6399F1.10303@nortelnetworks.com> <20030303.095641.87696857.davem@redhat.com> <3E63A8CB.2090307@nortelnetworks.com> <20030303.105646.02089773.davem@redhat.com> <1046720532.28127.213.camel@eggis1> <3E63CA08.4040209@nortelnetworks.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 04 Mar 2003 00:38:02 +0100 Message-Id: <1046734683.28127.275.camel@eggis1> Mime-Version: 1.0 X-archive-position: 1855 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: terje.eggestad@scali.com Precedence: bulk X-list: netdev Content-Length: 2975 Lines: 71

On the latency, I believe a 25% increase doesn't matter all that much. (routinely send messages sub-microsecond.)

That tcp BW is ridiculously low; make sure that you run with good-sized socket buffers, and that tcp windowing is enabled.

But then again, if you want to send much data fast between processes, a stream socket is a pretty bad idea anyway. Use a) shm, or b) mmap a file, write into it, and send the filename to the other side, then mmap it there.

Don't underestimate the BW of a fedex'ed tape.

TJ

On Mon, 2003-03-03 at 22:32, Chris Friesen wrote:
Terje Eggestad wrote:
> On Mon, 2003-03-03 at 19:56, David S. Miller wrote:
> TCP bandwidth is slightly faster than AF_UNIX bandwidth on my
> sparc64 boxes for example.
>
> I've seen that their are the same on linux.I tried to to do AF_UNIX
> instead of AF_INET internally to boost perf, but to no avail.
Makes you > suspect that the loopback device actually create an AF_UNIX connection > under the hood ;-)

On my P4 1.8GHz, AF_INET vs AF_UNIX looks like this:

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------
Host      OS      2p/0K Pipe  AF    UDP   RPC/  TCP   RPC/  TCP
                  ctxsw       UNIX        UDP         TCP   conn
--------- ------- ----- ----- ----- ----- ----- ----- ----- -----
pcard0ks. 2.4.18- 1.740  10.4  15.9  20.1  33.1  23.5  44.3  72.7
pcard0ks. 2.4.18- 1.560  10.6  16.0  23.4  38.1  36.1  44.6  77.4

*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host      OS      Pipe  AF    TCP   File   Mmap   Bcopy  Bcopy  Mem   Mem
                        UNIX        reread reread (libc) (hand) read  write
--------- ------- ----- ----- ----- ------ ------ ------ ------ ----- -----
pcard0ks. 2.4.18-  650.  677.  151.  721.9  958.0  290.8  288.8  955. 418.4
pcard0ks. 2.4.18-  379.  701.  163.  714.8  949.5  289.5  288.5  956. 420.5

On this machine at least, UDP latency is 25% worse than AF_UNIX, and TCP bandwidth is about 22% that of AF_UNIX.
Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com -- _________________________________________________________________________ Terje Eggestad mailto:terje.eggestad@scali.no Scali Scalable Linux Systems http://www.scali.com Olaf Helsets Vei 6 tel: +47 22 62 89 61 (OFFICE) P.O.Box 150, Oppsal +47 975 31 574 (MOBILE) N-0619 Oslo fax: +47 22 62 89 51 NORWAY _________________________________________________________________________ From hadi@cyberus.ca Mon Mar 3 18:38:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 03 Mar 2003 18:38:48 -0800 (PST) Received: from mx02.cyberus.ca (mx02.cyberus.ca [216.191.240.26]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h242cif03294 for ; Mon, 3 Mar 2003 18:38:44 -0800 Received: from shell.cyberus.ca ([216.191.240.114]) by mx02.cyberus.ca with esmtp (Exim 4.10) id 18q2KJ-000JPB-00; Mon, 03 Mar 2003 21:38:43 -0500 Received: from shell.cyberus.ca (localhost.cyberus.ca [127.0.0.1]) by shell.cyberus.ca (8.12.6/8.12.6) with ESMTP id h242cIqu068027; Mon, 3 Mar 2003 21:38:18 -0500 (EST) (envelope-from hadi@cyberus.ca) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.12.6/8.12.6/Submit) with ESMTP id h242cHmE068024; Mon, 3 Mar 2003 21:38:17 -0500 (EST) (envelope-from hadi@cyberus.ca) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Mon, 3 Mar 2003 21:38:17 -0500 (EST) From: jamal To: Terje Eggestad cc: Chris Friesen , linux-kernel , "" , "" , "" Subject: Re: anyone ever done multicast AF_UNIX sockets? 
In-Reply-To: <1046734165.27924.263.camel@eggis1> Message-ID: <20030303212628.M67734@shell.cyberus.ca> References: <3E5E7081.6020704@nortelnetworks.com> <1046695876.7731.78.camel@pc-16.office .scali.no> <3E638C51.2000904@nortelnetworks.com> <1046720360.28127.209.camel@eggis1> <3E63D73A.2000402@nortelnetworks.com> <1046734165.27924.263.camel@eggis1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1856 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 635 Lines: 27 Hi Terje, On Mon, 4 Mar 2003, Terje Eggestad wrote: > How do you design a protocol that uses multicast to send a request to do > work? > > All uses I can think of right now of multicast/broadcast is: > * Discovery, like in NIS. > * Announcements like in OSPF. > * update like in NTP broadcast > I know we are digressing away from main discussion ... The concept of reliable multicast is known to be useful. Look at(for some sample apps): http://www.ietf.org/html.charters/rmt-charter.html But we are talking about a distributed system in that context. Agreed, reliability and multicast do not always make sense. 
cheers, jamal From hshmulik@intel.com Tue Mar 4 09:11:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 04 Mar 2003 09:11:50 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h24HBkf16696 for ; Tue, 4 Mar 2003 09:11:46 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h24H5Qe12437 for ; Tue, 4 Mar 2003 17:05:26 GMT Received: from fmsmsxvs042.fm.intel.com (fmsmsxvs042.fm.intel.com [132.233.42.128]) by petasus.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h24H6GT18665 for ; Tue, 4 Mar 2003 17:06:16 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxvs042.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003030409142528026 ; Tue, 04 Mar 2003 09:14:26 -0800 Date: Tue, 4 Mar 2003 19:11:42 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: bonding-devel@lists.sourceforge.net, , cc: jgarzik@pobox.com Subject: [PATCH][bonding] division by zero bug Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1857 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 1165 Lines: 34 The following patch fixes a division by zero bug in the bonding module that happens when transmitting through a bond with no slaves, in the XOR bonding mode. The patch is against bonding-2.4.20-20030207 in sorceforge (http://sourceforge.net/projects/bonding/). 
diff -urN linux-2.4.20-20030207/drivers/net/bonding.c linux-2.4.20-devel/drivers/net/bonding.c --- linux-2.4.20-20030207/drivers/net/bonding.c 2003-03-02 14:01:46.000000000 +0200 +++ linux-2.4.20-devel/drivers/net/bonding.c 2003-03-02 14:35:04.000000000 +0200 @@ -2597,6 +2597,13 @@ return 0; } + if (bond->slave_cnt == 0) { + /* no slaves in the bond, frame not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + slave_no = (data->h_dest[5]^slave->dev->dev_addr[5]) % bond->slave_cnt; while ( (slave_no > 0) && (slave != (slave_t *)bond) ) { -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From ahu@outpost.ds9a.nl Wed Mar 5 03:28:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 03:29:01 -0800 (PST) Received: from outpost.ds9a.nl (outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25BSvf12503 for ; Wed, 5 Mar 2003 03:28:58 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id CEBE74501; Wed, 5 Mar 2003 12:28:52 +0100 (CET) Date: Wed, 5 Mar 2003 12:28:52 +0100 From: bert hubert To: Andreas Jellinghaus Cc: mit_warlord@users.sourceforge.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: ipsec-tools 0.1 + kernel 2.5.64 Message-ID: <20030305112852.GA22351@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Andreas Jellinghaus , mit_warlord@users.sourceforge.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com References: <1046863752.441.7.camel@simulacron> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1046863752.441.7.camel@simulacron> User-Agent: Mutt/1.3.28i X-archive-position: 1858 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 728 Lines: 
22 On Wed, Mar 05, 2003 at 12:29:12PM +0100, Andreas Jellinghaus wrote: > Hi, > > both manual keying and automatic keying with racoon (pre-shared secret) > are working fine. No need to patch or modify anything. > I tried only ipv4. By the way, regarding ipsec-tools 0.1, are you sure you want to fork the projects involved? By the way, you did not mention it here but ipsec-tools is available on http://sourceforge.net/projects/ipsec-tools , I also link them from http://lartc.org/howto/lartc.ipsec.html Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From linux-netdev@gmane.org Wed Mar 5 04:31:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 04:31:32 -0800 (PST) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25CVQf14743 for ; Wed, 5 Mar 2003 04:31:27 -0800 Received: from root by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 18qY2T-0005EJ-00 for ; Wed, 05 Mar 2003 13:30:25 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 18qRvo-0001Rd-00 for ; Wed, 05 Mar 2003 06:59:08 +0100 From: "jpaul" Subject: SuSE 8.1 Wireless Network Date: Wed, 05 Mar 2003 07:03:35 +0100 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Complaints-To: usenet@main.gmane.org User-Agent: Pan/0.13.3 (That cat's something I can't explain) X-archive-position: 1859 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jpaulb@web.de Precedence: bulk X-list: netdev Content-Length: 236 Lines: 9 I would like to put together a wireless network. 
I was thinking about the linksys WEFW11S DSL router and the WUSD11 USD wireless adapter for a laptop Has anyone any experence with these?? Easy of setup if they work at all etc. Paul From eric@lammerts.org Wed Mar 5 06:11:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 06:11:29 -0800 (PST) Received: from ezri.xs4all.nl (ezri.xs4all.nl [194.109.253.9]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25EBPf16010 for ; Wed, 5 Mar 2003 06:11:26 -0800 Received: (qmail 16995 invoked by uid 502); 5 Mar 2003 14:11:23 -0000 Date: Wed, 5 Mar 2003 15:11:23 +0100 From: Eric Lammerts To: linux-net@vger.kernel.org, netdev@oss.sgi.com Cc: alan@lxorguk.ukuu.org.uk Subject: [PATCH] wrong ENETDOWN in af_packet? Message-ID: <20030305141123.GA16699@ally.lammerts.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1860 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eric@lammerts.org Precedence: bulk X-list: netdev Content-Length: 2599 Lines: 120 Hi, I have a program that goes like this (source code at end of mail): open PF_PACKET socket look up index of interface x bind to that interface bring interface x down (~IFF_UP) bring interface x up (IFF_UP|IFF_RUNNING) for(;;) { recvfrom() } Problem: the first recvfrom() always results in ENETDOWN. The reason is that (in af_packet.c) packet_notifier(NETDEV_DOWN) sets sk->err to ENETDOWN, but packet_notifier(NETDEV_UP) doesn't clear it. Is this behaviour deliberate? 
If not, I suggest the following patch:

diff -u -r1.1.1.1 af_packet.c
--- linux-2.4.19/net/packet/af_packet.c 10 Jan 2003 16:20:09 -0000 1.1.1.1
+++ linux-2.4.19/net/packet/af_packet.c 5 Mar 2003 11:04:33 -0000
@@ -1407,6 +1407,7 @@
 			dev_add_pack(&po->prot_hook);
 			sock_hold(sk);
 			po->running = 1;
+			sk->err = 0;
 		}
 		spin_unlock(&po->bind_lock);
 #ifdef CONFIG_PACKET_MULTICAST

Currently I work around the problem by doing a getsockopt(x, SOL_SOCKET, SO_ERROR,...) to clear the error variable.

Eric

/* The header names were lost in the archive; these are the ones the code needs. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <net/ethernet.h>
#include <netpacket/packet.h>

/* Set and/or clear interface flags (e.g. IFF_UP) on device_name. */
void modify_iface_flags(int sock, char *device_name, short set, short reset)
{
	struct ifreq ifr;

	strncpy(ifr.ifr_name, device_name, IFNAMSIZ);
	if(ioctl(sock, SIOCGIFFLAGS, &ifr) < 0) {
		perror("SIOCGIFFLAGS");
		exit(1);
	}

	ifr.ifr_flags |= set;
	ifr.ifr_flags &= ~reset;

	strncpy(ifr.ifr_name, device_name, IFNAMSIZ);
	if(ioctl(sock, SIOCSIFFLAGS, &ifr) < 0) {
		perror("SIOCSIFFLAGS");
		exit(1);
	}
}

/* Look up the interface index and bind the packet socket to it. */
void bind_to_iface(int sock, char *ifacename)
{
	struct ifreq ifr;
	struct sockaddr_ll sa;

	strncpy(ifr.ifr_name, ifacename, IFNAMSIZ);
	if(ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
		perror("ioctl SIOCGIFINDEX");
		exit(1);
	}

	/* zero the address so sll_protocol and friends are not stack garbage */
	memset(&sa, 0, sizeof(sa));
	sa.sll_family = AF_PACKET;
	sa.sll_ifindex = ifr.ifr_ifindex;
	if(bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("bind");
		exit(1);
	}
}

int main()
{
	int fd, sz;
	char iface[] = "eth0";
	unsigned char data[1518];
	struct sockaddr_ll sa;
	socklen_t salen;

	fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if(fd < 0) {
		perror("socket");
		exit(1);
	}

	bind_to_iface(fd, iface);

	// bring it down
	modify_iface_flags(fd, iface, 0, IFF_UP);

	// bring it up
	modify_iface_flags(fd, iface, IFF_UP | IFF_RUNNING, 0);

	// receive packet; this first call is the one that fails with ENETDOWN
	salen = sizeof(sa);
	sz = recvfrom(fd, data, sizeof(data), 0, (struct sockaddr *)&sa, &salen);
	if(sz == -1) {
		perror("recvfrom");
		exit(1);
	}

	return 0;
}

From agx@linux.it Wed Mar 5 06:24:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 06:24:41 -0800 (PST)
Received: from ax-agx.axnet.it (dns.axnet.it [217.59.82.2]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25EObf16832 for ; Wed, 5 Mar 2003 06:24:38 -0800 Received: by ax-agx.axnet.it (Postfix, from userid 1000) id DF4354234D; Wed, 5 Mar 2003 15:27:47 +0100 (CET) Date: Wed, 5 Mar 2003 15:27:47 +0100 From: Antonio Gallo To: mitch@sfgoth.com Cc: netdev@oss.sgi.com Subject: [Bug] PPPoATM or ATM module problem with ADSL PCI Cards Message-ID: <20030305142747.GA17315@ax-agx.axnet.it> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline User-Agent: Mutt/1.4i X-Disclaimer: Please visit http://www.badpenguin.org/ X-Operating-System: Bad Penguin GNU/Linux 0.99.7 X-archive-position: 1861 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: agx@linux.it Precedence: bulk X-list: netdev Content-Length: 2907 Lines: 79 I'm not sure if is a real bug or something wrong but after 3 weeks of tests i'm thinking is something inside the kernel (pppoatm.o) or the ATM layer in general, and so some friend at @linux.it suggest me to contact directly the mantainers instead of writing to the kernel ML. This is the situation: LAN <=== ethernet ===> Zyxel Router <=== PPPoA ===> Provider This works! I remove the phone cable from the router and put it into our Linux Box Linux <=== PPPoA ===> Provider I've used 2 different kind of cards: 1. BeWan PCI ADSL ST (ATM driver + pppd with the atm plugin) 2. Pulsar ADSL (provide /dev/ttyG0 so i just use normal ppp ) Boths card can detect the line (Link Up), i can also see the "link" LED to become "on". Where i run the PPP to connect to the provider i can see the "Tx" LED working but the "Rx" LED not. Confirmation of this is done through "ifconfig" or "cat /proc/net/atm/UNICORN:0" that showme a positive number for Tx and Error number for Rx. I contacted the provider of the line and it told me the right parameter of the line, that i was already know. 
Provider: Elitel (via Telecom Italia) - Italy
Line: DTM
Protocol: VCMUX, RFC2364, PPPoA
VPI.VCI: 8.35
Bandwidth: 128 up / 640 down
User access: username+password (PPP CHAP/PAP)
IP address: Dynamically assigned

I stayed on the phone with them for an hour. In fact, they were able to see that I am transmitting ATM cells, but those cells contain invalid data, so they drop them. Is this the reason why I never see any Rx packets?

So where is the problem? Is the kernel or the ppp plugin sending wrong information? If the problem were the card, it would be strange to see the same problem on both cards (different drivers, different architectures, different chipsets, etc.).

So is my machine really sending wrong ATM cells? Debugging of the "Bewan" driver showed this:

Mar 5 11:06:53 ax-dummy kernel: unicorn_atmdrv.c : unicorn_atm_open:
Mar 5 11:06:53 ax-dummy kernel: unicorn_atm: ESI=00:9f:c8:f1:f7:58
Mar 5 11:06:53 ax-dummy kernel: unicorn_atm: upstream_rate=639 Kbits/s,downstream_rate=6143 Kbits/s
Mar 5 11:06:53 ax-dummy kernel: unicorn_atmdrv.c : get_link_rate: link_rate=1507 cells/sec
Mar 5 11:06:53 ax-dummy kernel: unicorn_atmdrv.c : aal5_decode: skb to short,skb->len=48,pdu_length=27264
Mar 5 11:06:53 ax-dummy kernel: unicorn_atmdrv.c : rcv_poll: wrong VPI.VCI 15.16
Mar 5 11:06:53 ax-dummy kernel: unicorn_atmdrv.c : rcv_poll: wrong VPI.VCI 15.16
Mar 5 11:06:53 ax-dummy kernel: unicorn_atmdrv.c : aal5_decode: skb to short,skb->len=48,pdu_length=27264

I am also waiting for an answer from the support of both ADSL cards.

I hope you can give me some indication of why I'm sending wrong cells and how to check what and where the real problem is.

Oops, I forgot to mention that I tested with both 2.4.20 and 2.4.18 kernels.

Thank you in advance,
Antonio Gallo
www.badpenguin.org

p.s.
I'm really lost :-(

From kazunori@miyazawa.org Wed Mar 5 06:30:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 06:30:23 -0800 (PST) Received: from miyazawa.org (usen-43x235x12x234.ap-USEN.usen.ad.jp [43.235.12.234]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25EUEf17528 for ; Wed, 5 Mar 2003 06:30:14 -0800 Received: from monza.miyazawa.org ([2001:200:0:ff18:220:e0ff:fe8a:e797]) (AUTH: LOGIN kazunori, ) by miyazawa.org with esmtp; Wed, 05 Mar 2003 23:12:12 +0900 Date: Wed, 5 Mar 2003 23:30:25 +0900 From: Kazunori Miyazawa To: davem@redhat.com, kuznet@ms2.inr.ac.ru Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: [PATH] IPv6 IPsec support Message-Id: <20030305233025.784feb00.kazunori@miyazawa.org> X-Mailer: Sylpheed version 0.8.10 (GTK+ 1.2.10; i386-debian-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1862 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kazunori@miyazawa.org Precedence: bulk X-list: netdev Content-Length: 120844 Lines: 4451

Hello,

I am submitting the patch to let the kernel support IPv6 IPsec again. With it, IPv6 can be compiled as a module. This patch includes a couple of clean-ups and some function renames.

Sorry, this patch is for linux-2.5.63.

Best Regards,

--Kazunori Miyazawa (Yokogawa Electric Corporation)

Patch-Name: IPsec
Patch-Id: IPSEC_2_5_63_ALL-20030304
Patch-Author: Kazunori Miyazawa
Credit: Kazunori Miyazawa , Mitsuru Kanda , YOSHIFUJI Hideaki , Kunihiro Ishiguro

This patch makes the kernel process IPv6 packets with IPsec.

- We've introduced a function pointer (xfrm_dst_lookup) for looking up the routing table of each address family, so that IPv6 can be compiled as a module.
- We moved some functions common to several protocols, such as skb_icv_walk(), from net/ipv4/{ah.c,esp.c} to net/ipv4/xfrm_algo.c. This allows ah / esp and ah6 / esp6 to be compiled as modules.
- We renamed some IPv4 specific xfrm_XXX() functions to xfrm4_XXX(). diff -ruN -x CVS linux-2.5.63/include/linux/ipv6.h linux25/include/linux/ipv6.h --- linux-2.5.63/include/linux/ipv6.h 2003-02-25 04:05:38.000000000 +0900 +++ linux25/include/linux/ipv6.h 2003-03-05 11:30:34.000000000 +0900 @@ -74,6 +74,21 @@ #define rt0_type rt_hdr.type; }; +struct ipv6_auth_hdr { + __u8 nexthdr; + __u8 hdrlen; /* This one is measured in 32 bit units! */ + __u16 reserved; + __u32 spi; + __u32 seq_no; /* Sequence number */ + __u8 auth_data[4]; /* Length variable but >=4. Mind the 64 bit alignment! */ +}; + +struct ipv6_esp_hdr { + __u32 spi; + __u32 seq_no; /* Sequence number */ + __u8 enc_data[8]; /* Length variable but >=8. Mind the 64 bit alignment! */ +}; + /* * IPv6 fixed header * diff -ruN -x CVS linux-2.5.63/include/net/dst.h linux25/include/net/dst.h --- linux-2.5.63/include/net/dst.h 2003-02-25 04:05:44.000000000 +0900 +++ linux25/include/net/dst.h 2003-03-05 17:49:50.000000000 +0900 @@ -247,7 +247,10 @@ struct flowi; extern int xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, struct sock *sk, int flags); +extern int xfrm6_lookup(struct dst_entry **dst_p, struct flowi *fl, + struct sock *sk, int flags); extern void xfrm_init(void); +extern void xfrm6_init(void); #endif diff -ruN -x CVS linux-2.5.63/include/net/ip6_route.h linux25/include/net/ip6_route.h --- linux-2.5.63/include/net/ip6_route.h 2003-02-25 04:05:12.000000000 +0900 +++ linux25/include/net/ip6_route.h 2003-03-04 20:38:14.000000000 +0900 @@ -38,6 +38,7 @@ extern int ipv6_route_ioctl(unsigned int cmd, void *arg); extern int ip6_route_add(struct in6_rtmsg *rtmsg); +extern int ip6_route_del(struct in6_rtmsg *rtmsg); extern int ip6_del_rt(struct rt6_info *); extern int ip6_rt_addr_add(struct in6_addr *addr, @@ -57,6 +58,8 @@ struct in6_addr *saddr, int oif, int flags); +extern struct rt6_info *ndisc_get_dummy_rt(void); + /* * support functions for ND * diff -ruN -x CVS linux-2.5.63/include/net/xfrm.h 
linux25/include/net/xfrm.h --- linux-2.5.63/include/net/xfrm.h 2003-02-25 04:05:41.000000000 +0900 +++ linux25/include/net/xfrm.h 2003-03-05 17:49:51.000000000 +0900 @@ -12,6 +12,7 @@ #include #include +#include #define XFRM_ALIGN8(len) (((len) + 7) & ~7) @@ -282,6 +283,7 @@ struct xfrm_dst *next; struct dst_entry dst; struct rtable rt; + struct rt6_info rt6; } u; }; @@ -308,26 +310,42 @@ if (sp && atomic_dec_and_test(&sp->refcnt)) __secpath_destroy(sp); } - -extern int __xfrm_policy_check(struct sock *, int dir, struct sk_buff *skb); +extern int __xfrm_policy_check(struct sock *, int dir, struct sk_buff *skb, unsigned short family); static inline int xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb) { if (sk && sk->policy[XFRM_POLICY_IN]) - return __xfrm_policy_check(sk, dir, skb); + return __xfrm_policy_check(sk, dir, skb, AF_INET); return !xfrm_policy_list[dir] || (skb->dst->flags & DST_NOPOLICY) || - __xfrm_policy_check(sk, dir, skb); + __xfrm_policy_check(sk, dir, skb, AF_INET); } -extern int __xfrm_route_forward(struct sk_buff *skb); +static inline int xfrm6_policy_check(struct sock *sk, int dir, struct sk_buff *skb) +{ + if (sk && sk->policy[XFRM_POLICY_IN]) + return __xfrm_policy_check(sk, dir, skb, AF_INET6); + + return !xfrm_policy_list[dir] || + (skb->dst->flags & DST_NOPOLICY) || + __xfrm_policy_check(sk, dir, skb, AF_INET6); +} + +extern int __xfrm_route_forward(struct sk_buff *skb, unsigned short family); static inline int xfrm_route_forward(struct sk_buff *skb) { return !xfrm_policy_list[XFRM_POLICY_OUT] || (skb->dst->flags & DST_NOXFRM) || - __xfrm_route_forward(skb); + __xfrm_route_forward(skb, AF_INET); +} + +static inline int xfrm6_route_forward(struct sk_buff *skb) +{ + return !xfrm_policy_list[XFRM_POLICY_OUT] || + (skb->dst->flags & DST_NOXFRM) || + __xfrm_route_forward(skb, AF_INET6); } extern int __xfrm_sk_clone_policy(struct sock *sk); @@ -380,12 +398,16 @@ extern void xfrm_input_init(void); extern int xfrm_state_walk(u8 
proto, int (*func)(struct xfrm_state *, int, void*), void *); extern struct xfrm_state *xfrm_state_alloc(void); -extern struct xfrm_state *xfrm_state_find(u32 daddr, u32 saddr, struct flowi *fl, struct xfrm_tmpl *tmpl, - struct xfrm_policy *pol, int *err); +extern struct xfrm_state *xfrm4_state_find(u32 daddr, u32 saddr, struct flowi *fl, struct xfrm_tmpl *tmpl, + struct xfrm_policy *pol, int *err); +extern struct xfrm_state *xfrm6_state_find(struct in6_addr *daddr, struct in6_addr *saddr, + struct flowi *fl, struct xfrm_tmpl *tmpl, + struct xfrm_policy *pol, int *err); extern int xfrm_state_check_expire(struct xfrm_state *x); extern void xfrm_state_insert(struct xfrm_state *x); extern int xfrm_state_check_space(struct xfrm_state *x, struct sk_buff *skb); -extern struct xfrm_state *xfrm_state_lookup(u32 daddr, u32 spi, u8 proto); +extern struct xfrm_state *xfrm4_state_lookup(u32 daddr, u32 spi, u8 proto); +extern struct xfrm_state *xfrm6_state_lookup(struct in6_addr *daddr, u32 spi, u8 proto); extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq); extern void xfrm_state_delete(struct xfrm_state *x); extern void xfrm_state_flush(u8 proto); @@ -393,17 +415,21 @@ extern void xfrm_replay_advance(struct xfrm_state *x, u32 seq); extern int xfrm_check_selectors(struct xfrm_state **x, int n, struct flowi *fl); extern int xfrm4_rcv(struct sk_buff *skb); +extern int xfrm6_rcv(struct sk_buff *skb); +extern int xfrm6_clear_mutable_options(struct sk_buff *skb, u16 *nh_offset, int dir); extern int xfrm_user_policy(struct sock *sk, int optname, u8 *optval, int optlen); struct xfrm_policy *xfrm_policy_alloc(int gfp); extern int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), void *); -struct xfrm_policy *xfrm_policy_lookup(int dir, struct flowi *fl); +struct xfrm_policy *xfrm_policy_lookup(int dir, struct flowi *fl, unsigned short family); int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl); struct xfrm_policy *xfrm_policy_delete(int dir, 
struct xfrm_selector *sel); struct xfrm_policy *xfrm_policy_byid(int dir, u32 id, int delete); void xfrm_policy_flush(void); void xfrm_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi); struct xfrm_state * xfrm_find_acq(u8 mode, u16 reqid, u8 proto, u32 daddr, u32 saddr, int create); +struct xfrm_state * xfrm6_find_acq(u8 mode, u16 reqid, u8 proto, struct in6_addr *daddr, + struct in6_addr *saddr, int create); extern void xfrm_policy_flush(void); extern void xfrm_policy_kill(struct xfrm_policy *); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); @@ -425,23 +451,129 @@ extern struct xfrm_algo_desc *xfrm_aalg_get_byname(char *name); extern struct xfrm_algo_desc *xfrm_ealg_get_byname(char *name); +static __inline__ int addr_match(void *token1, void *token2, int prefixlen) +{ + __u32 *a1 = token1; + __u32 *a2 = token2; + int pdw; + int pbi; + + pdw = prefixlen >> 5; /* num of whole __u32 in prefix */ + pbi = prefixlen & 0x1f; /* num of bits in incomplete u32 in prefix */ + + if (pdw) + if (memcmp(a1, a2, pdw << 2)) + return 0; + + if (pbi) { + __u32 mask; + + mask = htonl((0xffffffff) << (32 - pbi)); + + if ((a1[pdw] ^ a2[pdw]) & mask) + return 0; + } + + return 1; +} + static inline int xfrm6_selector_match(struct xfrm_selector *sel, struct flowi *fl) { - return !memcmp(fl->fl6_dst, sel->daddr.a6, sizeof(struct in6_addr)) && - !((fl->uli_u.ports.dport^sel->dport)&sel->dport_mask) && - !((fl->uli_u.ports.sport^sel->sport)&sel->sport_mask) && - (fl->proto == sel->proto || !sel->proto) && - (fl->oif == sel->ifindex || !sel->ifindex) && - !memcmp(fl->fl6_src, sel->saddr.a6, sizeof(struct in6_addr)); + return addr_match(fl->fl6_dst, &sel->daddr, sel->prefixlen_d) && + addr_match(fl->fl6_src, &sel->saddr, sel->prefixlen_s) && + !((fl->uli_u.ports.dport^sel->dport)&sel->dport_mask) && + !((fl->uli_u.ports.sport^sel->sport)&sel->sport_mask) && + (fl->proto == sel->proto || !sel->proto) && + (fl->oif == sel->ifindex || !sel->ifindex); 
} extern int xfrm6_register_type(struct xfrm_type *type); extern int xfrm6_unregister_type(struct xfrm_type *type); extern struct xfrm_type *xfrm6_get_type(u8 proto); -extern struct xfrm_state *xfrm6_state_lookup(struct in6_addr *daddr, u32 spi, u8 proto); -struct xfrm_state * xfrm6_find_acq(u8 mode, u16 reqid, u8 proto, struct in6_addr *daddr, struct in6_addr *saddr, int create); -void xfrm6_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi); +struct ah_data +{ + u8 *key; + int key_len; + u8 *work_icv; + int icv_full_len; + int icv_trunc_len; + + void (*icv)(struct ah_data*, + struct sk_buff *skb, u8 *icv); + + struct crypto_tfm *tfm; +}; + +struct esp_data +{ + /* Confidentiality */ + struct { + u8 *key; /* Key */ + int key_len; /* Key length */ + u8 *ivec; /* ivec buffer */ + /* ivlen is offset from enc_data, where encrypted data start. + * It is logically different of crypto_tfm_alg_ivsize(tfm). + * We assume that it is either zero (no ivec), or + * >= crypto_tfm_alg_ivsize(tfm). */ + int ivlen; + int padlen; /* 0..255 */ + struct crypto_tfm *tfm; /* crypto handle */ + } conf; + + /* Integrity. 
It is active when icv_full_len != 0 */ + struct { + u8 *key; /* Key */ + int key_len; /* Length of the key */ + u8 *work_icv; + int icv_full_len; + int icv_trunc_len; + void (*icv)(struct esp_data*, + struct sk_buff *skb, + int offset, int len, u8 *icv); + struct crypto_tfm *tfm; + } auth; +}; + +typedef void (icv_update_fn_t)(struct crypto_tfm *, struct scatterlist *, unsigned int); +extern void skb_ah_walk(const struct sk_buff *skb, + struct crypto_tfm *tfm, icv_update_fn_t icv_update); +extern void skb_icv_walk(const struct sk_buff *skb, struct crypto_tfm *tfm, + int offset, int len, icv_update_fn_t icv_update); +extern int skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len); +extern int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer); +extern void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len); + +static inline void +ah_hmac_digest(struct ah_data *ahp, struct sk_buff *skb, u8 *auth_data) +{ + struct crypto_tfm *tfm = ahp->tfm; + + memset(auth_data, 0, ahp->icv_trunc_len); + crypto_hmac_init(tfm, ahp->key, &ahp->key_len); + skb_ah_walk(skb, tfm, crypto_hmac_update); + crypto_hmac_final(tfm, ahp->key, &ahp->key_len, ahp->work_icv); + memcpy(auth_data, ahp->work_icv, ahp->icv_trunc_len); +} + +static inline void +esp_hmac_digest(struct esp_data *esp, struct sk_buff *skb, int offset, + int len, u8 *auth_data) +{ + struct crypto_tfm *tfm = esp->auth.tfm; + char *icv = esp->auth.work_icv; + + memset(auth_data, 0, esp->auth.icv_trunc_len); + crypto_hmac_init(tfm, esp->auth.key, &esp->auth.key_len); + skb_icv_walk(skb, tfm, offset, len, crypto_hmac_update); + crypto_hmac_final(tfm, esp->auth.key, &esp->auth.key_len, icv); + memcpy(auth_data, icv, esp->auth.icv_trunc_len); +} + + +typedef int (xfrm_dst_lookup_t)(struct xfrm_dst **dst, struct flowi *fl); +int xfrm_dst_lookup_register(xfrm_dst_lookup_t *dst_lookup, unsigned short family); +void xfrm_dst_lookup_unregister(unsigned short family); #endif /* 
_NET_XFRM_H */ diff -ruN -x CVS linux-2.5.63/net/ipv4/ah.c linux25/net/ipv4/ah.c --- linux-2.5.63/net/ipv4/ah.c 2003-02-25 04:05:42.000000000 +0900 +++ linux25/net/ipv4/ah.c 2003-03-05 17:49:52.000000000 +0900 @@ -7,25 +7,8 @@ #include #include -#define AH_HLEN_NOICV 12 - -typedef void (icv_update_fn_t)(struct crypto_tfm *, - struct scatterlist *, unsigned int); - -struct ah_data -{ - u8 *key; - int key_len; - u8 *work_icv; - int icv_full_len; - int icv_trunc_len; - - void (*icv)(struct ah_data*, - struct sk_buff *skb, u8 *icv); - - struct crypto_tfm *tfm; -}; +#define AH_HLEN_NOICV 12 /* Clear mutable options and find final destination to substitute * into IP header for icv calculation. Options are already checked @@ -71,92 +54,6 @@ return 0; } -static void skb_ah_walk(const struct sk_buff *skb, - struct crypto_tfm *tfm, icv_update_fn_t icv_update) -{ - int offset = 0; - int len = skb->len; - int start = skb->len - skb->data_len; - int i, copy = start - offset; - struct scatterlist sg; - - /* Checksum header. 
*/ - if (copy > 0) { - if (copy > len) - copy = len; - - sg.page = virt_to_page(skb->data + offset); - sg.offset = (unsigned long)(skb->data + offset) % PAGE_SIZE; - sg.length = copy; - - icv_update(tfm, &sg, 1); - - if ((len -= copy) == 0) - return; - offset += copy; - } - - for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { - int end; - - BUG_TRAP(start <= offset + len); - - end = start + skb_shinfo(skb)->frags[i].size; - if ((copy = end - offset) > 0) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (copy > len) - copy = len; - - sg.page = frag->page; - sg.offset = frag->page_offset + offset-start; - sg.length = copy; - - icv_update(tfm, &sg, 1); - - if (!(len -= copy)) - return; - offset += copy; - } - start = end; - } - - if (skb_shinfo(skb)->frag_list) { - struct sk_buff *list = skb_shinfo(skb)->frag_list; - - for (; list; list = list->next) { - int end; - - BUG_TRAP(start <= offset + len); - - end = start + list->len; - if ((copy = end - offset) > 0) { - if (copy > len) - copy = len; - skb_ah_walk(list, tfm, icv_update); - if ((len -= copy) == 0) - return; - offset += copy; - } - start = end; - } - } - if (len) - BUG(); -} - -static void -ah_hmac_digest(struct ah_data *ahp, struct sk_buff *skb, u8 *auth_data) -{ - struct crypto_tfm *tfm = ahp->tfm; - - memset(auth_data, 0, ahp->icv_trunc_len); - crypto_hmac_init(tfm, ahp->key, &ahp->key_len); - skb_ah_walk(skb, tfm, crypto_hmac_update); - crypto_hmac_final(tfm, ahp->key, &ahp->key_len, ahp->work_icv); - memcpy(auth_data, ahp->work_icv, ahp->icv_trunc_len); -} - static int ah_output(struct sk_buff *skb) { int err; @@ -330,7 +227,7 @@ skb->h.icmph->code != ICMP_FRAG_NEEDED) return; - x = xfrm_state_lookup(iph->daddr, ah->spi, IPPROTO_AH); + x = xfrm4_state_lookup(iph->daddr, ah->spi, IPPROTO_AH); if (!x) return; printk(KERN_DEBUG "pmtu discvovery on SA AH/%08x/%08x\n", diff -ruN -x CVS linux-2.5.63/net/ipv4/esp.c linux25/net/ipv4/esp.c --- linux-2.5.63/net/ipv4/esp.c 2003-02-25 04:05:34.000000000 +0900 
+++ linux25/net/ipv4/esp.c 2003-03-05 17:49:52.000000000 +0900 @@ -8,312 +8,8 @@ #include #include -#define MAX_SG_ONSTACK 4 - -typedef void (icv_update_fn_t)(struct crypto_tfm *, - struct scatterlist *, unsigned int); - -/* BUGS: - * - we assume replay seqno is always present. - */ - -struct esp_data -{ - /* Confidentiality */ - struct { - u8 *key; /* Key */ - int key_len; /* Key length */ - u8 *ivec; /* ivec buffer */ - /* ivlen is offset from enc_data, where encrypted data start. - * It is logically different of crypto_tfm_alg_ivsize(tfm). - * We assume that it is either zero (no ivec), or - * >= crypto_tfm_alg_ivsize(tfm). */ - int ivlen; - int padlen; /* 0..255 */ - struct crypto_tfm *tfm; /* crypto handle */ - } conf; - - /* Integrity. It is active when icv_full_len != 0 */ - struct { - u8 *key; /* Key */ - int key_len; /* Length of the key */ - u8 *work_icv; - int icv_full_len; - int icv_trunc_len; - void (*icv)(struct esp_data*, - struct sk_buff *skb, - int offset, int len, u8 *icv); - struct crypto_tfm *tfm; - } auth; -}; - -/* Move to common area: it is shared with AH. */ - -void skb_icv_walk(const struct sk_buff *skb, struct crypto_tfm *tfm, - int offset, int len, icv_update_fn_t icv_update) -{ - int start = skb->len - skb->data_len; - int i, copy = start - offset; - struct scatterlist sg; - - /* Checksum header. 
*/ - if (copy > 0) { - if (copy > len) - copy = len; - - sg.page = virt_to_page(skb->data + offset); - sg.offset = (unsigned long)(skb->data + offset) % PAGE_SIZE; - sg.length = copy; - - icv_update(tfm, &sg, 1); - - if ((len -= copy) == 0) - return; - offset += copy; - } - - for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { - int end; - - BUG_TRAP(start <= offset + len); - - end = start + skb_shinfo(skb)->frags[i].size; - if ((copy = end - offset) > 0) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (copy > len) - copy = len; - - sg.page = frag->page; - sg.offset = frag->page_offset + offset-start; - sg.length = copy; - - icv_update(tfm, &sg, 1); - - if (!(len -= copy)) - return; - offset += copy; - } - start = end; - } - - if (skb_shinfo(skb)->frag_list) { - struct sk_buff *list = skb_shinfo(skb)->frag_list; - - for (; list; list = list->next) { - int end; - - BUG_TRAP(start <= offset + len); - - end = start + list->len; - if ((copy = end - offset) > 0) { - if (copy > len) - copy = len; - skb_icv_walk(list, tfm, offset-start, copy, icv_update); - if ((len -= copy) == 0) - return; - offset += copy; - } - start = end; - } - } - if (len) - BUG(); -} - - -/* Looking generic it is not used in another places. 
*/ - -int -skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len) -{ - int start = skb->len - skb->data_len; - int i, copy = start - offset; - int elt = 0; - - if (copy > 0) { - if (copy > len) - copy = len; - sg[elt].page = virt_to_page(skb->data + offset); - sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE; - sg[elt].length = copy; - elt++; - if ((len -= copy) == 0) - return elt; - offset += copy; - } - - for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { - int end; - - BUG_TRAP(start <= offset + len); - - end = start + skb_shinfo(skb)->frags[i].size; - if ((copy = end - offset) > 0) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - - if (copy > len) - copy = len; - sg[elt].page = frag->page; - sg[elt].offset = frag->page_offset+offset-start; - sg[elt].length = copy; - elt++; - if (!(len -= copy)) - return elt; - offset += copy; - } - start = end; - } - - if (skb_shinfo(skb)->frag_list) { - struct sk_buff *list = skb_shinfo(skb)->frag_list; - - for (; list; list = list->next) { - int end; - - BUG_TRAP(start <= offset + len); - - end = start + list->len; - if ((copy = end - offset) > 0) { - if (copy > len) - copy = len; - elt += skb_to_sgvec(list, sg+elt, offset - start, copy); - if ((len -= copy) == 0) - return elt; - offset += copy; - } - start = end; - } - } - if (len) - BUG(); - return elt; -} - -/* Common with AH after some work on arguments. */ - -static void -esp_hmac_digest(struct esp_data *esp, struct sk_buff *skb, int offset, - int len, u8 *auth_data) -{ - struct crypto_tfm *tfm = esp->auth.tfm; - char *icv = esp->auth.work_icv; - - memset(auth_data, 0, esp->auth.icv_trunc_len); - crypto_hmac_init(tfm, esp->auth.key, &esp->auth.key_len); - skb_icv_walk(skb, tfm, offset, len, crypto_hmac_update); - crypto_hmac_final(tfm, esp->auth.key, &esp->auth.key_len, icv); - memcpy(auth_data, icv, esp->auth.icv_trunc_len); -} - -/* Check that skb data bits are writable. 
If they are not, copy data - * to newly created private area. If "tailbits" is given, make sure that - * tailbits bytes beyond current end of skb are writable. - * - * Returns amount of elements of scatterlist to load for subsequent - * transformations and pointer to writable trailer skb. - */ - -int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer) -{ - int copyflag; - int elt; - struct sk_buff *skb1, **skb_p; - - /* If skb is cloned or its head is paged, reallocate - * head pulling out all the pages (pages are considered not writable - * at the moment even if they are anonymous). - */ - if ((skb_cloned(skb) || skb_shinfo(skb)->nr_frags) && - __pskb_pull_tail(skb, skb_pagelen(skb)-skb_headlen(skb)) == NULL) - return -ENOMEM; - - /* Easy case. Most of packets will go this way. */ - if (!skb_shinfo(skb)->frag_list) { - /* A little of trouble, not enough of space for trailer. - * This should not happen, when stack is tuned to generate - * good frames. OK, on miss we reallocate and reserve even more - * space, 128 bytes is fair. */ - - if (skb_tailroom(skb) < tailbits && - pskb_expand_head(skb, 0, tailbits-skb_tailroom(skb)+128, GFP_ATOMIC)) - return -ENOMEM; - - /* Voila! */ - *trailer = skb; - return 1; - } - - /* Misery. We are in troubles, going to mincer fragments... */ - elt = 1; - skb_p = &skb_shinfo(skb)->frag_list; - copyflag = 0; - - while ((skb1 = *skb_p) != NULL) { - int ntail = 0; - - /* The fragment is partially pulled by someone, - * this can happen on input. Copy it and everything - * after it. */ - - if (skb_shared(skb1)) - copyflag = 1; - - /* If the skb is the last, worry about trailer. 
*/ - - if (skb1->next == NULL && tailbits) { - if (skb_shinfo(skb1)->nr_frags || - skb_shinfo(skb1)->frag_list || - skb_tailroom(skb1) < tailbits) - ntail = tailbits + 128; - } - - if (copyflag || - skb_cloned(skb1) || - ntail || - skb_shinfo(skb1)->nr_frags || - skb_shinfo(skb1)->frag_list) { - struct sk_buff *skb2; - - /* Fuck, we are miserable poor guys... */ - if (ntail == 0) - skb2 = skb_copy(skb1, GFP_ATOMIC); - else - skb2 = skb_copy_expand(skb1, - skb_headroom(skb1), - ntail, - GFP_ATOMIC); - if (unlikely(skb2 == NULL)) - return -ENOMEM; - - if (skb1->sk) - skb_set_owner_w(skb, skb1->sk); - - /* Looking around. Are we still alive? - * OK, link new skb, drop old one */ - - skb2->next = skb1->next; - *skb_p = skb2; - kfree_skb(skb1); - skb1 = skb2; - } - elt++; - *trailer = skb1; - skb_p = &skb1->next; - } - - return elt; -} - -void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len) -{ - if (tail != skb) { - skb->data_len += len; - skb->len += len; - } - return skb_put(tail, len); -} +#define MAX_SG_ONSTACK 4 int esp_output(struct sk_buff *skb) { @@ -575,7 +271,7 @@ skb->h.icmph->code != ICMP_FRAG_NEEDED) return; - x = xfrm_state_lookup(iph->daddr, esph->spi, IPPROTO_ESP); + x = xfrm4_state_lookup(iph->daddr, esph->spi, IPPROTO_ESP); if (!x) return; printk(KERN_DEBUG "pmtu discvovery on SA ESP/%08x/%08x\n", diff -ruN -x CVS linux-2.5.63/net/ipv4/route.c linux25/net/ipv4/route.c --- linux-2.5.63/net/ipv4/route.c 2003-02-25 04:06:01.000000000 +0900 +++ linux25/net/ipv4/route.c 2003-03-04 20:38:15.000000000 +0900 @@ -96,6 +96,7 @@ #include #include #include +#include #ifdef CONFIG_SYSCTL #include #endif @@ -2599,6 +2600,13 @@ #endif /* CONFIG_PROC_FS */ #endif /* CONFIG_NET_CLS_ROUTE */ +int xfrm_dst_lookup(struct xfrm_dst **dst, struct flowi *fl) +{ + int err = 0; + err = __ip_route_output_key((struct rtable**)dst, fl); + return err; +} + int __init ip_rt_init(void) { int i, order, goal, rc = 0; @@ -2680,6 +2688,7 @@ ip_rt_gc_interval; 
add_timer(&rt_periodic_timer); + xfrm_dst_lookup_register(xfrm_dst_lookup, AF_INET); #ifdef CONFIG_PROC_FS if (rt_cache_proc_init()) goto out_enomem; diff -ruN -x CVS linux-2.5.63/net/ipv4/xfrm_algo.c linux25/net/ipv4/xfrm_algo.c --- linux-2.5.63/net/ipv4/xfrm_algo.c 2003-02-25 04:05:16.000000000 +0900 +++ linux25/net/ipv4/xfrm_algo.c 2003-03-04 20:38:16.000000000 +0900 @@ -8,9 +8,11 @@ * Software Foundation; either version 2 of the License, or (at your option) * any later version. */ +#include #include #include #include +#include /* * Algorithms supported by IPsec. These entries contain properties which @@ -348,3 +350,333 @@ n++; return n; } + +#if defined(CONFIG_INET_AH) || defined(CONFIG_INET_AH_MODULE) || defined(CONFIG_INET6_AH) || defined(CONFIG_INET6_AH_MODULE) +void skb_ah_walk(const struct sk_buff *skb, + struct crypto_tfm *tfm, icv_update_fn_t icv_update) +{ + int offset = 0; + int len = skb->len; + int start = skb->len - skb->data_len; + int i, copy = start - offset; + struct scatterlist sg; + + /* Checksum header. 
*/ + if (copy > 0) { + if (copy > len) + copy = len; + + sg.page = virt_to_page(skb->data + offset); + sg.offset = (unsigned long)(skb->data + offset) % PAGE_SIZE; + sg.length = copy; + + icv_update(tfm, &sg, 1); + + if ((len -= copy) == 0) + return; + offset += copy; + } + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + skb_shinfo(skb)->frags[i].size; + if ((copy = end - offset) > 0) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (copy > len) + copy = len; + + sg.page = frag->page; + sg.offset = frag->page_offset + offset-start; + sg.length = copy; + + icv_update(tfm, &sg, 1); + + if (!(len -= copy)) + return; + offset += copy; + } + start = end; + } + + if (skb_shinfo(skb)->frag_list) { + struct sk_buff *list = skb_shinfo(skb)->frag_list; + + for (; list; list = list->next) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + list->len; + if ((copy = end - offset) > 0) { + if (copy > len) + copy = len; + skb_ah_walk(list, tfm, icv_update); + if ((len -= copy) == 0) + return; + offset += copy; + } + start = end; + } + } + if (len) + BUG(); +} +#endif + +#if defined(CONFIG_INET_ESP) || defined(CONFIG_INET_ESP_MODULE) || defined(CONFIG_INET6_ESP) || defined(CONFIG_INET6_ESP_MODULE) +/* Move to common area: it is shared with AH. */ + +void skb_icv_walk(const struct sk_buff *skb, struct crypto_tfm *tfm, + int offset, int len, icv_update_fn_t icv_update) +{ + int start = skb->len - skb->data_len; + int i, copy = start - offset; + struct scatterlist sg; + + /* Checksum header. 
*/ + if (copy > 0) { + if (copy > len) + copy = len; + + sg.page = virt_to_page(skb->data + offset); + sg.offset = (unsigned long)(skb->data + offset) % PAGE_SIZE; + sg.length = copy; + + icv_update(tfm, &sg, 1); + + if ((len -= copy) == 0) + return; + offset += copy; + } + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + skb_shinfo(skb)->frags[i].size; + if ((copy = end - offset) > 0) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (copy > len) + copy = len; + + sg.page = frag->page; + sg.offset = frag->page_offset + offset-start; + sg.length = copy; + + icv_update(tfm, &sg, 1); + + if (!(len -= copy)) + return; + offset += copy; + } + start = end; + } + + if (skb_shinfo(skb)->frag_list) { + struct sk_buff *list = skb_shinfo(skb)->frag_list; + + for (; list; list = list->next) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + list->len; + if ((copy = end - offset) > 0) { + if (copy > len) + copy = len; + skb_icv_walk(list, tfm, offset-start, copy, icv_update); + if ((len -= copy) == 0) + return; + offset += copy; + } + start = end; + } + } + if (len) + BUG(); +} + + +/* Looking generic it is not used in another places. 
*/ + +int +skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len) +{ + int start = skb->len - skb->data_len; + int i, copy = start - offset; + int elt = 0; + + if (copy > 0) { + if (copy > len) + copy = len; + sg[elt].page = virt_to_page(skb->data + offset); + sg[elt].offset = (unsigned long)(skb->data + offset) % PAGE_SIZE; + sg[elt].length = copy; + elt++; + if ((len -= copy) == 0) + return elt; + offset += copy; + } + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + skb_shinfo(skb)->frags[i].size; + if ((copy = end - offset) > 0) { + skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; + + if (copy > len) + copy = len; + sg[elt].page = frag->page; + sg[elt].offset = frag->page_offset+offset-start; + sg[elt].length = copy; + elt++; + if (!(len -= copy)) + return elt; + offset += copy; + } + start = end; + } + + if (skb_shinfo(skb)->frag_list) { + struct sk_buff *list = skb_shinfo(skb)->frag_list; + + for (; list; list = list->next) { + int end; + + BUG_TRAP(start <= offset + len); + + end = start + list->len; + if ((copy = end - offset) > 0) { + if (copy > len) + copy = len; + elt += skb_to_sgvec(list, sg+elt, offset - start, copy); + if ((len -= copy) == 0) + return elt; + offset += copy; + } + start = end; + } + } + if (len) + BUG(); + return elt; +} + +/* Check that skb data bits are writable. If they are not, copy data + * to newly created private area. If "tailbits" is given, make sure that + * tailbits bytes beyond current end of skb are writable. + * + * Returns amount of elements of scatterlist to load for subsequent + * transformations and pointer to writable trailer skb. 
+ */ + +int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer) +{ + int copyflag; + int elt; + struct sk_buff *skb1, **skb_p; + + /* If skb is cloned or its head is paged, reallocate + * head pulling out all the pages (pages are considered not writable + * at the moment even if they are anonymous). + */ + if ((skb_cloned(skb) || skb_shinfo(skb)->nr_frags) && + __pskb_pull_tail(skb, skb_pagelen(skb)-skb_headlen(skb)) == NULL) + return -ENOMEM; + + /* Easy case. Most of packets will go this way. */ + if (!skb_shinfo(skb)->frag_list) { + /* A little of trouble, not enough of space for trailer. + * This should not happen, when stack is tuned to generate + * good frames. OK, on miss we reallocate and reserve even more + * space, 128 bytes is fair. */ + + if (skb_tailroom(skb) < tailbits && + pskb_expand_head(skb, 0, tailbits-skb_tailroom(skb)+128, GFP_ATOMIC)) + return -ENOMEM; + + /* Voila! */ + *trailer = skb; + return 1; + } + + /* Misery. We are in troubles, going to mincer fragments... */ + + elt = 1; + skb_p = &skb_shinfo(skb)->frag_list; + copyflag = 0; + + while ((skb1 = *skb_p) != NULL) { + int ntail = 0; + + /* The fragment is partially pulled by someone, + * this can happen on input. Copy it and everything + * after it. */ + + if (skb_shared(skb1)) + copyflag = 1; + + /* If the skb is the last, worry about trailer. */ + + if (skb1->next == NULL && tailbits) { + if (skb_shinfo(skb1)->nr_frags || + skb_shinfo(skb1)->frag_list || + skb_tailroom(skb1) < tailbits) + ntail = tailbits + 128; + } + + if (copyflag || + skb_cloned(skb1) || + ntail || + skb_shinfo(skb1)->nr_frags || + skb_shinfo(skb1)->frag_list) { + struct sk_buff *skb2; + + /* Fuck, we are miserable poor guys... */ + if (ntail == 0) + skb2 = skb_copy(skb1, GFP_ATOMIC); + else + skb2 = skb_copy_expand(skb1, + skb_headroom(skb1), + ntail, + GFP_ATOMIC); + if (unlikely(skb2 == NULL)) + return -ENOMEM; + + if (skb1->sk) + skb_set_owner_w(skb, skb1->sk); + + /* Looking around. 
Are we still alive? + * OK, link new skb, drop old one */ + + skb2->next = skb1->next; + *skb_p = skb2; + kfree_skb(skb1); + skb1 = skb2; + } + elt++; + *trailer = skb1; + skb_p = &skb1->next; + } + + return elt; +} + +void *pskb_put(struct sk_buff *skb, struct sk_buff *tail, int len) +{ + if (tail != skb) { + skb->data_len += len; + skb->len += len; + } + return skb_put(tail, len); +} +#endif diff -ruN -x CVS linux-2.5.63/net/ipv4/xfrm_input.c linux25/net/ipv4/xfrm_input.c --- linux-2.5.63/net/ipv4/xfrm_input.c 2003-02-25 04:05:05.000000000 +0900 +++ linux25/net/ipv4/xfrm_input.c 2003-03-05 17:49:52.000000000 +0900 @@ -1,4 +1,14 @@ +/* Changes + * + * Mitsuru KANDA @USAGI : IPv6 Support + * Kazunori MIYAZAWA @USAGI : + * YOSHIFUJI Hideaki @USAGI : + * Kunihiro Ishiguro : + * + */ + #include +#include #include static kmem_cache_t *secpath_cachep; @@ -64,7 +74,7 @@ if (xfrm_nr == XFRM_MAX_DEPTH) goto drop; - x = xfrm_state_lookup(iph->daddr, spi, iph->protocol); + x = xfrm4_state_lookup(iph->daddr, spi, iph->protocol); if (x == NULL) goto drop; @@ -157,3 +167,288 @@ if (!secpath_cachep) panic("IP: failed to allocate secpath_cache\n"); } + +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + +/* Fetch spi and seq from ipsec header */ + +static int xfrm6_parse_spi(struct sk_buff *skb, u8 nexthdr, u32 *spi, u32 *seq) +{ + int offset, offset_seq; + + switch (nexthdr) { + case IPPROTO_AH: + offset = offsetof(struct ip_auth_hdr, spi); + offset_seq = offsetof(struct ip_auth_hdr, seq_no); + break; + case IPPROTO_ESP: + offset = offsetof(struct ip_esp_hdr, spi); + offset_seq = offsetof(struct ip_esp_hdr, seq_no); + break; + case IPPROTO_COMP: + if (!pskb_may_pull(skb, 4)) + return -EINVAL; + *spi = *(u16*)(skb->h.raw + 2); + *seq = 0; + return 0; + default: + return 1; + } + + if (!pskb_may_pull(skb, 16)) + return -EINVAL; + + *spi = *(u32*)(skb->h.raw + offset); + *seq = *(u32*)(skb->h.raw + offset_seq); + return 0; +} + +static int zero_out_mutable_opts(struct
ipv6_opt_hdr *opthdr) +{ + u8 *opt = (u8 *)opthdr; + int len = ipv6_optlen(opthdr); + int off = 0; + int optlen = 0; + + off += 2; + len -= 2; + + while (len > 0) { + + switch (opt[off]) { + + case IPV6_TLV_PAD0: + optlen = 1; + break; + default: + if (len < 2) + goto bad; + optlen = opt[off+1]+2; + if (len < optlen) + goto bad; + if (opt[off] & 0x20) + memset(&opt[off+2], 0, opt[off+1]); + break; + } + + off += optlen; + len -= optlen; + } + if (len == 0) + return 1; + +bad: + return 0; +} + +int xfrm6_clear_mutable_options(struct sk_buff *skb, u16 *nh_offset, int dir) +{ + u16 offset = sizeof(struct ipv6hdr); + struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + unsigned int packet_len = skb->tail - skb->nh.raw; + u8 nexthdr = skb->nh.ipv6h->nexthdr; + u8 nextnexthdr = 0; + + *nh_offset = ((unsigned char *)&skb->nh.ipv6h->nexthdr) - skb->nh.raw; + + while (offset + 1 <= packet_len) { + + switch (nexthdr) { + + case NEXTHDR_HOP: + *nh_offset = offset; + offset += ipv6_optlen(exthdr); + if (!zero_out_mutable_opts(exthdr)) { + if (net_ratelimit()) + printk(KERN_WARNING "overrun hopopts\n"); + return 0; + } + nexthdr = exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + break; + + case NEXTHDR_ROUTING: + *nh_offset = offset; + offset += ipv6_optlen(exthdr); + ((struct ipv6_rt_hdr*)exthdr)->segments_left = 0; + nexthdr = exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + break; + + case NEXTHDR_DEST: + *nh_offset = offset; + offset += ipv6_optlen(exthdr); + if (!zero_out_mutable_opts(exthdr)) { + if (net_ratelimit()) + printk(KERN_WARNING "overrun destopt\n"); + return 0; + } + nexthdr = exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + break; + + case NEXTHDR_AUTH: + if (dir == XFRM_POLICY_OUT) { + memset(((struct ipv6_auth_hdr*)exthdr)->auth_data, 0, + (((struct ipv6_auth_hdr*)exthdr)->hdrlen - 1) << 2); + } + if (exthdr->nexthdr == NEXTHDR_DEST) { + offset += 
(((struct ipv6_auth_hdr*)exthdr)->hdrlen + 2) << 2; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + nextnexthdr = exthdr->nexthdr; + if (!zero_out_mutable_opts(exthdr)) { + if (net_ratelimit()) + printk(KERN_WARNING "overrun destopt\n"); + return 0; + } + } + return nexthdr; + default : + return nexthdr; + } + } + + return nexthdr; +} + +int xfrm6_rcv(struct sk_buff *skb) +{ + int err; + u32 spi, seq; + struct xfrm_state *xfrm_vec[XFRM_MAX_DEPTH]; + struct xfrm_state *x; + int xfrm_nr = 0; + int decaps = 0; + struct ipv6hdr *hdr = skb->nh.ipv6h; + unsigned char *tmp_hdr = NULL; + int hdr_len = 0; + u16 nh_offset = 0; + u8 nexthdr = 0; + + if (hdr->nexthdr == IPPROTO_AH || hdr->nexthdr == IPPROTO_ESP) { + nh_offset = ((unsigned char*)&skb->nh.ipv6h->nexthdr) - skb->nh.raw; + hdr_len = sizeof(struct ipv6hdr); + } else { + hdr_len = skb->h.raw - skb->nh.raw; + } + + tmp_hdr = kmalloc(hdr_len, GFP_ATOMIC); + if (!tmp_hdr) + goto drop; + memcpy(tmp_hdr, skb->nh.raw, hdr_len); + + nexthdr = xfrm6_clear_mutable_options(skb, &nh_offset, XFRM_POLICY_IN); + hdr->priority = 0; + hdr->flow_lbl[0] = 0; + hdr->flow_lbl[1] = 0; + hdr->flow_lbl[2] = 0; + hdr->hop_limit = 0; + + if ((err = xfrm6_parse_spi(skb, nexthdr, &spi, &seq)) != 0) + goto drop; + + do { + struct ipv6hdr *iph = skb->nh.ipv6h; + + if (xfrm_nr == XFRM_MAX_DEPTH) + goto drop; + + x = xfrm6_state_lookup(&iph->daddr, spi, nexthdr); + if (x == NULL) + goto drop; + spin_lock(&x->lock); + if (unlikely(x->km.state != XFRM_STATE_VALID)) + goto drop_unlock; + + if (x->props.replay_window && xfrm_replay_check(x, seq)) + goto drop_unlock; + + nexthdr = x->type->input(x, skb); + if (nexthdr <= 0) + goto drop_unlock; + + if (x->props.replay_window) + xfrm_replay_advance(x, seq); + + x->curlft.bytes += skb->len; + x->curlft.packets++; + + spin_unlock(&x->lock); + + xfrm_vec[xfrm_nr++] = x; + + iph = skb->nh.ipv6h; /* ??? 
*/ + + if (nexthdr == NEXTHDR_DEST) { + if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8) || + !pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) { + err = -EINVAL; + goto drop; + } + nexthdr = skb->h.raw[0]; + nh_offset = skb->h.raw - skb->nh.raw; + skb_pull(skb, (skb->h.raw[1]+1)<<3); + skb->h.raw = skb->data; + } + + if (x->props.mode) { /* XXX */ + if (iph->nexthdr != IPPROTO_IPV6) + goto drop; + skb->nh.raw = skb->data; + iph = skb->nh.ipv6h; + decaps = 1; + break; + } + + if ((err = xfrm6_parse_spi(skb, nexthdr, &spi, &seq)) < 0) + goto drop; + } while (!err); + + memcpy(skb->nh.raw, tmp_hdr, hdr_len); + skb->nh.raw[nh_offset] = nexthdr; + skb->nh.ipv6h->payload_len = htons(hdr_len + skb->len - sizeof(struct ipv6hdr)); + + /* Allocate new secpath or COW existing one. */ + if (!skb->sp || atomic_read(&skb->sp->refcnt) != 1) { + struct sec_path *sp; + sp = kmem_cache_alloc(secpath_cachep, SLAB_ATOMIC); + if (!sp) + goto drop; + if (skb->sp) { + memcpy(sp, skb->sp, sizeof(struct sec_path)); + secpath_put(skb->sp); + } else + sp->len = 0; + atomic_set(&sp->refcnt, 1); + skb->sp = sp; + } + + if (xfrm_nr + skb->sp->len > XFRM_MAX_DEPTH) + goto drop; + + memcpy(skb->sp->xvec+skb->sp->len, xfrm_vec, xfrm_nr*sizeof(void*)); + skb->sp->len += xfrm_nr; + + if (decaps) { + if (!(skb->dev->flags&IFF_LOOPBACK)) { + dst_release(skb->dst); + skb->dst = NULL; + } + netif_rx(skb); + return 0; + } else { + return -nexthdr; + } + +drop_unlock: + spin_unlock(&x->lock); + xfrm_state_put(x); +drop: + if (tmp_hdr) kfree(tmp_hdr); + while (--xfrm_nr >= 0) + xfrm_state_put(xfrm_vec[xfrm_nr]); + kfree_skb(skb); + return 0; +} + +#endif /* CONFIG_IPV6 || CONFIG_IPV6_MODULE */ diff -ruN -x CVS linux-2.5.63/net/ipv4/xfrm_policy.c linux25/net/ipv4/xfrm_policy.c --- linux-2.5.63/net/ipv4/xfrm_policy.c 2003-02-25 04:05:32.000000000 +0900 +++ linux25/net/ipv4/xfrm_policy.c 2003-03-05 17:49:52.000000000 +0900 @@ -1,6 +1,16 @@ +/* Changes + * + * Mitsuru KANDA @USAGI : IPv6 
Support + * Kazunori MIYAZAWA @USAGI : + * Kunihiro Ishiguro : + * + */ + #include #include #include +#include +#include DECLARE_MUTEX(xfrm_cfg_sem); @@ -10,6 +20,11 @@ struct xfrm_policy *xfrm_policy_list[XFRM_POLICY_MAX*2]; extern struct dst_ops xfrm4_dst_ops; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) +extern struct dst_ops xfrm6_dst_ops; +#endif + +static inline int xfrm_dst_lookup(struct xfrm_dst **dst, struct flowi *fl, unsigned short family); /* Limited flow cache. Its function now is to accelerate search for * policy rules. @@ -48,6 +63,24 @@ return hash & (FLOWCACHE_HASH_SIZE-1); } +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) +static inline u32 flow_hash6(struct flowi *fl) +{ + u32 hash = fl->fl6_src->s6_addr32[2] ^ + fl->fl6_src->s6_addr32[3] ^ + fl->uli_u.ports.sport; + + hash = ((hash & 0xF0F0F0F0) >> 4) | ((hash & 0x0F0F0F0F) << 4); + + hash ^= fl->fl6_dst->s6_addr32[2] ^ + fl->fl6_dst->s6_addr32[3] ^ + fl->uli_u.ports.dport; + hash ^= (hash >> 10); + hash ^= (hash >> 20); + return hash & (FLOWCACHE_HASH_SIZE-1); +} +#endif + static int flow_lwm = 2*FLOWCACHE_HASH_SIZE; static int flow_hwm = 4*FLOWCACHE_HASH_SIZE; @@ -77,13 +110,27 @@ } } -struct xfrm_policy *flow_lookup(int dir, struct flowi *fl) +struct xfrm_policy *flow_lookup(int dir, struct flowi *fl, + unsigned short family) { - struct xfrm_policy *pol; + struct xfrm_policy *pol = NULL; struct flow_entry *fle; - u32 hash = flow_hash(fl); + u32 hash; int cpu; + switch (family) { + case AF_INET: + hash = flow_hash(fl); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + hash = flow_hash6(fl); + break; +#endif + default: + return NULL; + } + local_bh_disable(); cpu = smp_processor_id(); @@ -101,7 +148,7 @@ } } - pol = xfrm_policy_lookup(dir, fl); + pol = xfrm_policy_lookup(dir, fl, family); if (fle) { /* Stale flow entry found. Update it. 
*/ @@ -199,6 +246,46 @@ return type; } +static xfrm_dst_lookup_t *__xfrm_dst_lookup[AF_MAX]; +rwlock_t xdl_lock = RW_LOCK_UNLOCKED; + +int xfrm_dst_lookup_register(xfrm_dst_lookup_t *dst_lookup, + unsigned short family) +{ + int err = 0; + + write_lock(&xdl_lock); + if (__xfrm_dst_lookup[family]) + err = -ENOBUFS; + else { + __xfrm_dst_lookup[family] = dst_lookup; + } + write_unlock(&xdl_lock); + + return err; +} + +void xfrm_dst_lookup_unregister(unsigned short family) +{ + write_lock(&xdl_lock); + if (__xfrm_dst_lookup[family]) + __xfrm_dst_lookup[family] = 0; + write_unlock(&xdl_lock); +} + +static inline int xfrm_dst_lookup(struct xfrm_dst **dst, struct flowi *fl, + unsigned short family) +{ + int err = 0; + read_lock(&xdl_lock); + if (__xfrm_dst_lookup[family]) + err = __xfrm_dst_lookup[family](dst, fl); + else + err = -EINVAL; + read_unlock(&xdl_lock); + return err; +} + #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) static struct xfrm_type *xfrm6_type_map[256]; static rwlock_t xfrm6_type_lock = RW_LOCK_UNLOCKED; @@ -506,15 +593,32 @@ /* Find policy to apply to this flow. 
*/ -struct xfrm_policy *xfrm_policy_lookup(int dir, struct flowi *fl) +struct xfrm_policy *xfrm_policy_lookup(int dir, struct flowi *fl, + unsigned short family) { struct xfrm_policy *pol; read_lock_bh(&xfrm_policy_lock); for (pol = xfrm_policy_list[dir]; pol; pol = pol->next) { struct xfrm_selector *sel = &pol->selector; + int match; + + if (pol->family != family) + continue; - if (xfrm4_selector_match(sel, fl)) { + switch (family) { + case AF_INET: + match = xfrm4_selector_match(sel, fl); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + match = xfrm6_selector_match(sel, fl); + break; +#endif + default: + match = 0; + } + if (match) { atomic_inc(&pol->refcnt); break; } @@ -529,7 +633,21 @@ read_lock_bh(&xfrm_policy_lock); if ((pol = sk->policy[dir]) != NULL) { - if (xfrm4_selector_match(&pol->selector, fl)) + int match; + + switch (sk->family) { + case AF_INET: + match = xfrm4_selector_match(&pol->selector, fl); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + match = xfrm6_selector_match(&pol->selector, fl); + break; +#endif + default: + match = 0; + } + if (match) atomic_inc(&pol->refcnt); else pol = NULL; @@ -630,8 +748,8 @@ /* Resolve list of templates for the flow, given policy. */ static int -xfrm_tmpl_resolve(struct xfrm_policy *policy, struct flowi *fl, - struct xfrm_state **xfrm) +xfrm4_tmpl_resolve(struct xfrm_policy *policy, struct flowi *fl, + struct xfrm_state **xfrm) { int nx; int i, error; @@ -649,7 +767,53 @@ local = tmpl->saddr.xfrm4_addr; } - x = xfrm_state_find(remote, local, fl, tmpl, policy, &error); + x = xfrm4_state_find(remote, local, fl, tmpl, policy, &error); + + if (x && x->km.state == XFRM_STATE_VALID) { + xfrm[nx++] = x; + daddr = remote; + saddr = local; + continue; + } + if (x) { + error = (x->km.state == XFRM_STATE_ERROR ? 
+ -EINVAL : -EAGAIN); + xfrm_state_put(x); + } + + if (!tmpl->optional) + goto fail; + } + return nx; + +fail: + for (nx--; nx>=0; nx--) + xfrm_state_put(xfrm[nx]); + return error; +} + +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) +static int +xfrm6_tmpl_resolve(struct xfrm_policy *policy, struct flowi *fl, + struct xfrm_state **xfrm) +{ + int nx; + int i, error; + struct in6_addr *daddr = fl->fl6_dst; + struct in6_addr *saddr = fl->fl6_src; + + for (nx=0, i = 0; i < policy->xfrm_nr; i++) { + struct xfrm_state *x=NULL; + struct in6_addr *remote = daddr; + struct in6_addr *local = saddr; + struct xfrm_tmpl *tmpl = &policy->xfrm_vec[i]; + + if (tmpl->mode) { + remote = (struct in6_addr*)&tmpl->id.daddr; + local = (struct in6_addr*)&tmpl->saddr; + } + + x = xfrm6_state_find(remote, local, fl, tmpl, policy, &error); if (x && x->km.state == XFRM_STATE_VALID) { xfrm[nx++] = x; @@ -673,6 +837,7 @@ xfrm_state_put(xfrm[nx]); return error; } +#endif /* Check that the bundle accepts the flow and its components are * still valid. @@ -694,6 +859,24 @@ return 0; } +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +static int xfrm6_bundle_ok(struct xfrm_dst *xdst, struct flowi *fl) +{ + do { + if (xdst->u.dst.ops != &xfrm6_dst_ops) + return 1; + + if (!xfrm6_selector_match(&xdst->u.dst.xfrm->sel, fl)) + return 0; + if (xdst->u.dst.xfrm->km.state != XFRM_STATE_VALID || + xdst->u.dst.path->obsolete > 0) + return 0; + xdst = (struct xfrm_dst*)xdst->u.dst.child; + } while (xdst); + return 0; +} +#endif + /* Allocate chain of dst_entry's, attach known xfrm's, calculate * all the metrics... Shortly, bundle a bundle. 
@@ -744,7 +927,7 @@ .saddr = local } } }; - err = __ip_route_output_key(&rt, &fl_tunnel); + err = xfrm_dst_lookup((struct xfrm_dst**)&rt, &fl_tunnel, AF_INET); if (err) goto error; } else { @@ -791,6 +974,97 @@ return err; } +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +static int +xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int nx, + struct flowi *fl, struct dst_entry **dst_p) +{ + struct dst_entry *dst, *dst_prev; + struct rt6_info *rt0 = (struct rt6_info*)(*dst_p); + struct rt6_info *rt = rt0; + struct in6_addr *remote = fl->fl6_dst; + struct in6_addr *local = fl->fl6_src; + int i; + int err = 0; + int header_len = 0; + + dst = dst_prev = NULL; + + for (i = 0; i < nx; i++) { + struct dst_entry *dst1 = dst_alloc(&xfrm6_dst_ops); + + if (unlikely(dst1 == NULL)) { + err = -ENOBUFS; + goto error; + } + + dst1->xfrm = xfrm[i]; + if (!dst) + dst = dst1; + else { + dst_prev->child = dst1; + dst1->flags |= DST_NOHASH; + dst_clone(dst1); + } + dst_prev = dst1; + if (xfrm[i]->props.mode) { + remote = (struct in6_addr*)&xfrm[i]->id.daddr; + local = (struct in6_addr*)&xfrm[i]->props.saddr; + } + header_len += xfrm[i]->props.header_len; + } + + if (ipv6_addr_cmp(remote, fl->fl6_dst)) { + struct flowi fl_tunnel = { .nl_u = { .ip6_u = + { .daddr = remote, + .saddr = local } + } + }; + err = xfrm_dst_lookup((struct xfrm_dst**)&dst, &fl_tunnel, AF_INET6); + if (err) + goto error; + } else { + dst_clone(&rt->u.dst); + } + dst_prev->child = &rt->u.dst; + for (dst_prev = dst; dst_prev != &rt->u.dst; dst_prev = dst_prev->child) { + struct xfrm_dst *x = (struct xfrm_dst*)dst_prev; + x->u.rt.fl = *fl; + + dst_prev->dev = rt->u.dst.dev; + if (rt->u.dst.dev) + dev_hold(rt->u.dst.dev); + dst_prev->obsolete = -1; + dst_prev->flags |= DST_HOST; + dst_prev->lastuse = jiffies; + dst_prev->header_len = header_len; + memcpy(&dst_prev->metrics, &rt->u.dst.metrics, sizeof(dst_prev->metrics)); + dst_prev->path = &rt->u.dst; + + /* Copy neighbout for 
reachability confirmation */ + dst_prev->neighbour = neigh_clone(rt->u.dst.neighbour); + dst_prev->input = rt->u.dst.input; + dst_prev->output = dst_prev->xfrm->type->output; + /* Sheit... I remember I did this right. Apparently, + * it was magically lost, so this code needs audit */ + x->u.rt6.rt6i_flags = rt0->rt6i_flags&(RTCF_BROADCAST|RTCF_MULTICAST|RTCF_LOCAL); + x->u.rt6.rt6i_metric = rt0->rt6i_metric; + x->u.rt6.rt6i_node = rt0->rt6i_node; + x->u.rt6.rt6i_hoplimit = rt0->rt6i_hoplimit; + x->u.rt6.rt6i_gateway = rt0->rt6i_gateway; + memcpy(&x->u.rt6.rt6i_gateway, &rt0->rt6i_gateway, sizeof(x->u.rt6.rt6i_gateway)); + header_len -= x->u.dst.xfrm->props.header_len; + } + *dst_p = dst; + return 0; + +error: + if (dst) + dst_free(dst); + return err; +} +#endif + /* Main function: finds/creates a bundle for given flow. * * At the moment we eat a raw IP route. Mostly to speed up lookups @@ -806,9 +1080,7 @@ int nx = 0; int err; u32 genid; - - fl->oif = rt->u.dst.dev->ifindex; - fl->fl4_src = rt->rt_src; + u16 family = (*dst_p)->ops->family; restart: genid = xfrm_policy_genid; @@ -821,11 +1093,12 @@ if ((rt->u.dst.flags & DST_NOXFRM) || !xfrm_policy_list[XFRM_POLICY_OUT]) return 0; - policy = flow_lookup(XFRM_POLICY_OUT, fl); - if (!policy) - return 0; + policy = flow_lookup(XFRM_POLICY_OUT, fl, family); } + if (!policy) + return 0; + policy->curlft.use_time = (unsigned long)xtime.tv_sec; switch (policy->action) { @@ -846,23 +1119,48 @@ * LATER: help from flow cache. It is optional, this * is required only for output policy. 
*/ - read_lock_bh(&policy->lock); - for (dst = policy->bundles; dst; dst = dst->next) { - struct xfrm_dst *xdst = (struct xfrm_dst*)dst; - if (xdst->u.rt.fl.fl4_dst == fl->fl4_dst && - xdst->u.rt.fl.fl4_src == fl->fl4_src && - xdst->u.rt.fl.oif == fl->oif && - xfrm_bundle_ok(xdst, fl)) { - dst_clone(dst); + if (family == AF_INET) { + fl->oif = rt->u.dst.dev->ifindex; + fl->fl4_src = rt->rt_src; + read_lock_bh(&policy->lock); + for (dst = policy->bundles; dst; dst = dst->next) { + struct xfrm_dst *xdst = (struct xfrm_dst*)dst; + if (xdst->u.rt.fl.fl4_dst == fl->fl4_dst && + xdst->u.rt.fl.fl4_src == fl->fl4_src && + xdst->u.rt.fl.oif == fl->oif && + xfrm_bundle_ok(xdst, fl)) { + dst_clone(dst); + break; + } + } + read_unlock_bh(&policy->lock); + if (dst) break; + nx = xfrm4_tmpl_resolve(policy, fl, xfrm); +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + } else if (family == AF_INET6) { + read_lock_bh(&policy->lock); + for (dst = policy->bundles; dst; dst = dst->next) { + struct xfrm_dst *xdst = (struct xfrm_dst*)dst; + if (!ipv6_addr_cmp(&xdst->u.rt6.rt6i_dst.addr, fl->fl6_dst) && + !ipv6_addr_cmp(&xdst->u.rt6.rt6i_src.addr, fl->fl6_src) && + xfrm6_bundle_ok(xdst, fl)) { + dst_clone(dst); + break; + } } + read_unlock_bh(&policy->lock); + if (dst) + break; + nx = xfrm6_tmpl_resolve(policy, fl, xfrm); +#endif + } else { + return -EINVAL; } - read_unlock_bh(&policy->lock); if (dst) break; - nx = xfrm_tmpl_resolve(policy, fl, xfrm); if (unlikely(nx<0)) { err = nx; if (err == -EAGAIN) { @@ -873,7 +1171,18 @@ __set_task_state(tsk, TASK_INTERRUPTIBLE); add_wait_queue(&km_waitq, &wait); - err = xfrm_tmpl_resolve(policy, fl, xfrm); + switch (family) { + case AF_INET: + err = xfrm4_tmpl_resolve(policy, fl, xfrm); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + err = xfrm6_tmpl_resolve(policy, fl, xfrm); + break; +#endif + default: + err = -EINVAL; + } if (err == -EAGAIN) schedule(); __set_task_state(tsk, TASK_RUNNING); @@ 
-896,7 +1205,19 @@ } dst = &rt->u.dst; - err = xfrm_bundle_create(policy, xfrm, nx, fl, &dst); + switch (family) { + case AF_INET: + err = xfrm_bundle_create(policy, xfrm, nx, fl, &dst); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + err = xfrm6_bundle_create(policy, xfrm, nx, fl, &dst); + break; +#endif + default: + err = -EINVAL; + } + if (unlikely(err)) { int i; for (i=0; inh.iph; u8 *xprth = skb->nh.raw + iph->ihl*4; @@ -1008,18 +1329,109 @@ fl->fl4_src = iph->saddr; } -int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb) +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) +static inline int +xfrm6_state_ok(struct xfrm_tmpl *tmpl, struct xfrm_state *x) +{ + return x->id.proto == tmpl->id.proto && + (x->id.spi == tmpl->id.spi || !tmpl->id.spi) && + x->props.mode == tmpl->mode && + (tmpl->aalgos & (1<<x->props.aalgo)) && + (!x->props.mode || !ipv6_addr_any((struct in6_addr*)&x->props.saddr) || + !ipv6_addr_cmp((struct in6_addr *)&tmpl->saddr, (struct in6_addr*)&x->props.saddr)); +} + +static inline int +xfrm6_policy_ok(struct xfrm_tmpl *tmpl, struct sec_path *sp, int idx) +{ + for (; idx < sp->len; idx++) { + if (xfrm6_state_ok(tmpl, sp->xvec[idx])) + return ++idx; + } + return -1; +} + +static inline void +_decode_session6(struct sk_buff *skb, struct flowi *fl) +{ + u16 offset = sizeof(struct ipv6hdr); + struct ipv6hdr *hdr = skb->nh.ipv6h; + struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + u8 nexthdr = skb->nh.ipv6h->nexthdr; + + fl->fl6_dst = &hdr->daddr; + fl->fl6_src = &hdr->saddr; + + while (pskb_may_pull(skb, skb->nh.raw + offset + 1 - skb->data)) { + switch (nexthdr) { + case NEXTHDR_ROUTING: + case NEXTHDR_HOP: + case NEXTHDR_DEST: + offset += ipv6_optlen(exthdr); + nexthdr = exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + break; + + case IPPROTO_UDP: + case IPPROTO_TCP: + case IPPROTO_SCTP: + if (pskb_may_pull(skb, skb->nh.raw + offset +
4 - skb->data)) { + u16 *ports = (u16 *)exthdr; + + fl->uli_u.ports.sport = ports[0]; + fl->uli_u.ports.dport = ports[1]; + } + return; + + /* XXX Why are there these headers? */ + case IPPROTO_AH: + case IPPROTO_ESP: + default: + fl->uli_u.spi = 0; + return; + }; + } +} +#endif + +int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, + unsigned short family) { struct xfrm_policy *pol; struct flowi fl; - _decode_session(skb, &fl); + switch (family) { + case AF_INET: + _decode_session4(skb, &fl); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + _decode_session6(skb, &fl); + break; +#endif + default : + return 0; + } /* First, check used SA against their selectors. */ if (skb->sp) { int i; + for (i=skb->sp->len-1; i>=0; i--) { - if (!xfrm4_selector_match(&skb->sp->xvec[i]->sel, &fl)) + int match; + switch (family) { + case AF_INET: + match = xfrm4_selector_match(&skb->sp->xvec[i]->sel, &fl); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + match = xfrm6_selector_match(&skb->sp->xvec[i]->sel, &fl); + break; +#endif + default: + match = 0; + } + if (!match) return 0; } } @@ -1029,7 +1441,7 @@ pol = xfrm_sk_policy_lookup(sk, dir, &fl); if (!pol) - pol = flow_lookup(dir, &fl); + pol = flow_lookup(dir, &fl, family); if (!pol) return 1; @@ -1050,7 +1462,18 @@ * are implied between each two transformations. 
*/ for (i = pol->xfrm_nr-1, k = 0; i >= 0; i--) { - k = xfrm_policy_ok(pol->xfrm_vec+i, sp, k); + switch (family) { + case AF_INET: + k = xfrm_policy_ok(pol->xfrm_vec+i, sp, k); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + k = xfrm6_policy_ok(pol->xfrm_vec+i, sp, k); + break; +#endif + default: + k = -1; + } if (k < 0) goto reject; } @@ -1064,18 +1487,29 @@ return 0; } -int __xfrm_route_forward(struct sk_buff *skb) +int __xfrm_route_forward(struct sk_buff *skb, unsigned short family) { struct flowi fl; - _decode_session(skb, &fl); + switch (family) { + case AF_INET: + _decode_session4(skb, &fl); + break; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + case AF_INET6: + _decode_session6(skb, &fl); + break; +#endif + default: + return 0; + } return xfrm_lookup(&skb->dst, &fl, NULL, 0) == 0; } /* Optimize later using cookies and generation ids. */ -static struct dst_entry *xfrm4_dst_check(struct dst_entry *dst, u32 cookie) +static struct dst_entry *xfrm_dst_check(struct dst_entry *dst, u32 cookie) { struct dst_entry *child = dst; @@ -1091,19 +1525,19 @@ return dst; } -static void xfrm4_dst_destroy(struct dst_entry *dst) +static void xfrm_dst_destroy(struct dst_entry *dst) { xfrm_state_put(dst->xfrm); dst->xfrm = NULL; } -static void xfrm4_link_failure(struct sk_buff *skb) +static void xfrm_link_failure(struct sk_buff *skb) { /* Impossible. Such dst must be popped before reaches point of failure. 
*/ return; } -static struct dst_entry *xfrm4_negative_advice(struct dst_entry *dst) +static struct dst_entry *xfrm_negative_advice(struct dst_entry *dst) { if (dst) { if (dst->obsolete) { @@ -1114,8 +1548,7 @@ return dst; } - -static int xfrm4_garbage_collect(void) +static void __xfrm_garbage_collect(void) { int i; struct xfrm_policy *pol; @@ -1145,10 +1578,22 @@ gc_list = dst->next; dst_free(dst); } +} +static inline int xfrm4_garbage_collect(void) +{ + __xfrm_garbage_collect(); return (atomic_read(&xfrm4_dst_ops.entries) > xfrm4_dst_ops.gc_thresh*2); } +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +static inline int xfrm6_garbage_collect(void) +{ + __xfrm_garbage_collect(); + return (atomic_read(&xfrm6_dst_ops.entries) > xfrm6_dst_ops.gc_thresh*2); +} +#endif + static int bundle_depends_on(struct dst_entry *dst, struct xfrm_state *x) { do { @@ -1192,7 +1637,7 @@ return 0; } - + static void xfrm4_update_pmtu(struct dst_entry *dst, u32 mtu) { struct dst_entry *path = dst->path; @@ -1203,6 +1648,18 @@ path->ops->update_pmtu(path, mtu); } +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) +static void xfrm6_update_pmtu(struct dst_entry *dst, u32 mtu) +{ + struct dst_entry *path = dst->path; + + if (mtu >= 1280 && mtu < dst_pmtu(dst)) + return; + + path->ops->update_pmtu(path, mtu); +} +#endif + /* Well... that's _TASK_. We need to scan through transformation * list and figure out what mss tcp should generate in order to * final datagram fit to mtu. Mama mia... :-) @@ -1212,7 +1669,7 @@ * * Consider this function as something like dark humour. 
:-) */ -static int xfrm4_get_mss(struct dst_entry *dst, u32 mtu) +static int xfrm_get_mss(struct dst_entry *dst, u32 mtu) { int res = mtu - dst->header_len; @@ -1247,16 +1704,32 @@ .family = AF_INET, .protocol = __constant_htons(ETH_P_IP), .gc = xfrm4_garbage_collect, - .check = xfrm4_dst_check, - .destroy = xfrm4_dst_destroy, - .negative_advice = xfrm4_negative_advice, - .link_failure = xfrm4_link_failure, + .check = xfrm_dst_check, + .destroy = xfrm_dst_destroy, + .negative_advice = xfrm_negative_advice, + .link_failure = xfrm_link_failure, .update_pmtu = xfrm4_update_pmtu, - .get_mss = xfrm4_get_mss, + .get_mss = xfrm_get_mss, .gc_thresh = 1024, .entry_size = sizeof(struct xfrm_dst), }; +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) +struct dst_ops xfrm6_dst_ops = { + .family = AF_INET6, + .protocol = __constant_htons(ETH_P_IPV6), + .gc = xfrm6_garbage_collect, + .check = xfrm_dst_check, + .destroy = xfrm_dst_destroy, + .negative_advice = xfrm_negative_advice, + .link_failure = xfrm_link_failure, + .update_pmtu = xfrm6_update_pmtu, + .get_mss = xfrm_get_mss, + .gc_thresh = 1024, + .entry_size = sizeof(struct xfrm_dst), +}; +#endif /* CONFIG_IPV6 || CONFIG_IPV6_MODULE */ + void __init xfrm_init(void) { xfrm4_dst_ops.kmem_cachep = kmem_cache_create("xfrm4_dst_cache", @@ -1267,8 +1740,12 @@ if (!xfrm4_dst_ops.kmem_cachep) panic("IP: failed to allocate xfrm4_dst_cache\n"); - flow_cache_init(); +#if defined (CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) + xfrm6_dst_ops.kmem_cachep = xfrm4_dst_ops.kmem_cachep; +#endif + flow_cache_init(); xfrm_state_init(); xfrm_input_init(); } + diff -ruN -x CVS linux-2.5.63/net/ipv4/xfrm_state.c linux25/net/ipv4/xfrm_state.c --- linux-2.5.63/net/ipv4/xfrm_state.c 2003-02-25 04:05:38.000000000 +0900 +++ linux25/net/ipv4/xfrm_state.c 2003-03-05 20:17:33.000000000 +0900 @@ -1,3 +1,11 @@ +/* Changes + * + * Mitsuru KANDA @USAGI : IPv6 Support + * Kazunori MIYAZAWA @USAGI : + * Kunihiro Ishiguro : + * + */ + #include #include 
#include @@ -207,8 +215,8 @@ } struct xfrm_state * -xfrm_state_find(u32 daddr, u32 saddr, struct flowi *fl, struct xfrm_tmpl *tmpl, - struct xfrm_policy *pol, int *err) +xfrm4_state_find(u32 daddr, u32 saddr, struct flowi *fl, struct xfrm_tmpl *tmpl, + struct xfrm_policy *pol, int *err) { unsigned h = ntohl(daddr); struct xfrm_state *x; @@ -290,6 +298,7 @@ x->props.saddr.xfrm4_addr = saddr; x->props.mode = tmpl->mode; x->props.reqid = tmpl->reqid; + x->props.family = AF_INET; if (km_query(x, tmpl, pol) == 0) { x->km.state = XFRM_STATE_ACQ; @@ -318,14 +327,133 @@ return x; } +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +struct xfrm_state * +xfrm6_state_find(struct in6_addr *daddr, struct in6_addr *saddr, struct flowi *fl, struct xfrm_tmpl *tmpl, + struct xfrm_policy *pol, int *err) +{ + unsigned h = ntohl(daddr->s6_addr32[2]^daddr->s6_addr32[3]); + struct xfrm_state *x; + int acquire_in_progress = 0; + int error = 0; + struct xfrm_state *best = NULL; + + h = (h ^ (h>>16)) % XFRM_DST_HSIZE; + + spin_lock_bh(&xfrm_state_lock); + list_for_each_entry(x, xfrm_state_bydst+h, bydst) { + if (x->props.family == AF_INET6&& + !ipv6_addr_cmp(daddr, (struct in6_addr *)&x->id.daddr) && + x->props.reqid == tmpl->reqid && + (!ipv6_addr_cmp(saddr, (struct in6_addr *)&x->props.saddr)|| ipv6_addr_any(saddr)) && + tmpl->mode == x->props.mode && + tmpl->id.proto == x->id.proto) { + /* Resolution logic: + 1. There is a valid state with matching selector. + Done. + 2. Valid state with inappropriate selector. Skip. + + Entering area of "sysdeps". + + 3. If state is not valid, selector is temporary, + it selects only session which triggered + previous resolution. Key manager will do + something to install a state with proper + selector. 
+ */ + if (x->km.state == XFRM_STATE_VALID) { + if (!xfrm6_selector_match(&x->sel, fl)) + continue; + if (!best || + best->km.dying > x->km.dying || + (best->km.dying == x->km.dying && + best->curlft.add_time < x->curlft.add_time)) + best = x; + } else if (x->km.state == XFRM_STATE_ACQ) { + acquire_in_progress = 1; + } else if (x->km.state == XFRM_STATE_ERROR || + x->km.state == XFRM_STATE_EXPIRED) { + if (xfrm6_selector_match(&x->sel, fl)) + error = 1; + } + } + } + + if (best) { + atomic_inc(&best->refcnt); + spin_unlock_bh(&xfrm_state_lock); + return best; + } + x = NULL; + if (!error && !acquire_in_progress && + ((x = xfrm_state_alloc()) != NULL)) { + /* Initialize temporary selector matching only + * to current session. */ + memcpy(&x->sel.daddr, fl->fl6_dst, sizeof(struct in6_addr)); + memcpy(&x->sel.saddr, fl->fl6_src, sizeof(struct in6_addr)); + x->sel.dport = fl->uli_u.ports.dport; + x->sel.dport_mask = ~0; + x->sel.sport = fl->uli_u.ports.sport; + x->sel.sport_mask = ~0; + x->sel.prefixlen_d = 128; + x->sel.prefixlen_s = 128; + x->sel.proto = fl->proto; + x->sel.ifindex = fl->oif; + x->id = tmpl->id; + if (ipv6_addr_any((struct in6_addr*)&x->id.daddr)) + memcpy(&x->id.daddr, daddr, sizeof(x->sel.daddr)); + memcpy(&x->props.saddr, &tmpl->saddr, sizeof(x->props.saddr)); + if (ipv6_addr_any((struct in6_addr*)&x->props.saddr)) + memcpy(&x->props.saddr, &saddr, sizeof(x->sel.saddr)); + x->props.mode = tmpl->mode; + x->props.reqid = tmpl->reqid; + x->props.family = AF_INET6; + + if (km_query(x, tmpl, pol) == 0) { + x->km.state = XFRM_STATE_ACQ; + list_add_tail(&x->bydst, xfrm_state_bydst+h); + atomic_inc(&x->refcnt); + if (x->id.spi) { + struct in6_addr *addr = (struct in6_addr*)&x->id.daddr; + h = ntohl((addr->s6_addr32[2]^addr->s6_addr32[3])^x->id.spi^x->id.proto); + h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; + list_add(&x->byspi, xfrm_state_byspi+h); + atomic_inc(&x->refcnt); + } + x->lft.hard_add_expires_seconds = ACQ_EXPIRES; + atomic_inc(&x->refcnt); + 
mod_timer(&x->timer, ACQ_EXPIRES*HZ); + } else { + x->km.state = XFRM_STATE_DEAD; + xfrm_state_put(x); + x = NULL; + error = 1; + } + } + spin_unlock_bh(&xfrm_state_lock); + if (!x) + *err = acquire_in_progress ? -EAGAIN : + (error ? -ESRCH : -ENOMEM); + return x; +} +#endif /* CONFIG_IPV6 || CONFIG_IPV6_MODULE */ + void xfrm_state_insert(struct xfrm_state *x) { unsigned h = 0; - if (x->props.family == AF_INET) + switch (x->props.family) { + case AF_INET: h = ntohl(x->id.daddr.xfrm4_addr); - else if (x->props.family == AF_INET6) + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: h = ntohl(x->id.daddr.a6[2]^x->id.daddr.a6[3]); + break; +#endif + default: + return; + } h = (h ^ (h>>16)) % XFRM_DST_HSIZE; @@ -384,7 +512,7 @@ } struct xfrm_state * -xfrm_state_lookup(u32 daddr, u32 spi, u8 proto) +xfrm4_state_lookup(u32 daddr, u32 spi, u8 proto) { unsigned h = ntohl(daddr^spi^proto); struct xfrm_state *x; @@ -406,6 +534,31 @@ return NULL; } +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +struct xfrm_state * +xfrm6_state_lookup(struct in6_addr *daddr, u32 spi, u8 proto) +{ + unsigned h = ntohl(daddr->s6_addr32[2]^daddr->s6_addr32[3]^spi^proto); + struct xfrm_state *x; + + h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; + + spin_lock_bh(&xfrm_state_lock); + list_for_each_entry(x, xfrm_state_byspi+h, byspi) { + if (x->props.family == AF_INET6 && + spi == x->id.spi && + !ipv6_addr_cmp(daddr, (struct in6_addr *)x->id.daddr.a6) && + proto == x->id.proto) { + atomic_inc(&x->refcnt); + spin_unlock_bh(&xfrm_state_lock); + return x; + } + } + spin_unlock_bh(&xfrm_state_lock); + return NULL; +} +#endif + struct xfrm_state * xfrm_find_acq(u8 mode, u16 reqid, u8 proto, u32 daddr, u32 saddr, int create) { @@ -445,7 +598,59 @@ x0->km.state = XFRM_STATE_ACQ; x0->id.daddr.xfrm4_addr = daddr; x0->id.proto = proto; + x0->props.mode = mode; + x0->props.reqid = reqid; x0->props.family = AF_INET; + x0->lft.hard_add_expires_seconds = ACQ_EXPIRES; + 
atomic_inc(&x0->refcnt); + mod_timer(&x0->timer, jiffies + ACQ_EXPIRES*HZ); + atomic_inc(&x0->refcnt); + list_add_tail(&x0->bydst, xfrm_state_bydst+h); + wake_up(&km_waitq); + } + spin_unlock_bh(&xfrm_state_lock); + return x0; +} + +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +struct xfrm_state * +xfrm6_find_acq(u8 mode, u16 reqid, u8 proto, struct in6_addr *daddr, struct in6_addr *saddr, int create) +{ + struct xfrm_state *x, *x0; + unsigned h = ntohl(daddr->s6_addr32[2]^daddr->s6_addr32[3]); + + h = (h ^ (h>>16)) % XFRM_DST_HSIZE; + x0 = NULL; + + spin_lock_bh(&xfrm_state_lock); + list_for_each_entry(x, xfrm_state_bydst+h, bydst) { + if (x->props.family == AF_INET6 && + !ipv6_addr_cmp(daddr, (struct in6_addr *)x->id.daddr.a6) && + mode == x->props.mode && + proto == x->id.proto && + !ipv6_addr_cmp(saddr, (struct in6_addr *)x->props.saddr.a6) && + reqid == x->props.reqid && + x->km.state == XFRM_STATE_ACQ) { + if (!x0) + x0 = x; + if (x->id.spi) + continue; + x0 = x; + break; + } + } + if (x0) { + atomic_inc(&x0->refcnt); + } else if (create && (x0 = xfrm_state_alloc()) != NULL) { + memcpy(x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); + memcpy(x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); + x0->sel.prefixlen_d = 128; + x0->sel.prefixlen_s = 128; + memcpy(x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); + x0->km.state = XFRM_STATE_ACQ; + memcpy(x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); + x0->id.proto = proto; + x0->props.family = AF_INET6; x0->props.mode = mode; x0->props.reqid = reqid; x0->lft.hard_add_expires_seconds = ACQ_EXPIRES; @@ -458,6 +663,7 @@ spin_unlock_bh(&xfrm_state_lock); return x0; } +#endif /* Silly enough, but I'm lazy to build resolution list */ @@ -491,7 +697,18 @@ return; if (minspi == maxspi) { - x0 = xfrm_state_lookup(x->id.daddr.xfrm4_addr, minspi, x->id.proto); + switch(x->props.family) { + case AF_INET: + x0 = xfrm4_state_lookup(x->id.daddr.xfrm4_addr, minspi, x->id.proto); + break; +#if 
defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + x0 = xfrm6_state_lookup((struct in6_addr*)x->id.daddr.a6, minspi, x->id.proto); + break; +#endif + default: + x0 = NULL; + } if (x0) { xfrm_state_put(x0); return; @@ -503,7 +720,18 @@ maxspi = ntohl(maxspi); for (h=0; hid.daddr.xfrm4_addr, htonl(spi), x->id.proto); + switch(x->props.family) { + case AF_INET: + x0 = xfrm4_state_lookup(x->id.daddr.xfrm4_addr, htonl(spi), x->id.proto); + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + x0 = xfrm6_state_lookup((struct in6_addr*)x->id.daddr.a6, htonl(spi), x->id.proto); + break; +#endif + default: + x0 = NULL; + } if (x0 == NULL) break; xfrm_state_put(x0); @@ -512,7 +740,18 @@ } if (x->id.spi) { spin_lock_bh(&xfrm_state_lock); - h = ntohl(x->id.daddr.xfrm4_addr^x->id.spi^x->id.proto); + switch(x->props.family) { + case AF_INET: + h = ntohl(x->id.daddr.xfrm4_addr^x->id.spi^x->id.proto); + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + h = ntohl(x->id.daddr.a6[2]^x->id.daddr.a6[3]^x->id.spi^x->id.proto); + break; +#endif + default: + h = 0; /* XXX */ + } h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; list_add(&x->byspi, xfrm_state_byspi+h); atomic_inc(&x->refcnt); @@ -605,14 +844,21 @@ int i; for (i=0; iprops.family == AF_INET && - !xfrm4_selector_match(&x[i]->sel, fl)) - return -EINVAL; + int match; + switch(x[i]->props.family) { + case AF_INET: + match = xfrm4_selector_match(&x[i]->sel, fl); + break; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - if (x[i]->props.family == AF_INET6 && - !xfrm6_selector_match(&x[i]->sel, fl)) - return -EINVAL; + case AF_INET6: + match = xfrm6_selector_match(&x[i]->sel, fl); + break; #endif + default: + match = 0; + } + if (!match) + return -EINVAL; } return 0; } @@ -722,118 +968,3 @@ } } -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) -struct xfrm_state * -xfrm6_state_lookup(struct in6_addr *daddr, u32 spi, u8 proto) -{ - unsigned h = 
ntohl(daddr->s6_addr32[2]^daddr->s6_addr32[3]^spi^proto); - struct xfrm_state *x; - - h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; - - spin_lock_bh(&xfrm_state_lock); - list_for_each_entry(x, xfrm_state_byspi+h, byspi) { - if (x->props.family == AF_INET6 && - spi == x->id.spi && - !ipv6_addr_cmp(daddr, (struct in6_addr *)x->id.daddr.a6) && - proto == x->id.proto) { - atomic_inc(&x->refcnt); - spin_unlock_bh(&xfrm_state_lock); - return x; - } - } - spin_unlock_bh(&xfrm_state_lock); - return NULL; -} - -struct xfrm_state * -xfrm6_find_acq(u8 mode, u16 reqid, u8 proto, struct in6_addr *daddr, struct in6_addr *saddr, int create) -{ - struct xfrm_state *x, *x0; - unsigned h = ntohl(daddr->s6_addr32[2]^daddr->s6_addr32[3]); - - h = (h ^ (h>>16)) % XFRM_DST_HSIZE; - x0 = NULL; - - spin_lock_bh(&xfrm_state_lock); - list_for_each_entry(x, xfrm_state_bydst+h, bydst) { - if (x->props.family == AF_INET6 && - !memcmp(daddr, x->id.daddr.a6, sizeof(struct in6_addr)) && - mode == x->props.mode && - proto == x->id.proto && - !memcmp(saddr, x->props.saddr.a6, sizeof(struct in6_addr)) && - reqid == x->props.reqid && - x->km.state == XFRM_STATE_ACQ) { - if (!x0) - x0 = x; - if (x->id.spi) - continue; - x0 = x; - break; - } - } - if (x0) { - atomic_inc(&x0->refcnt); - } else if (create && (x0 = xfrm_state_alloc()) != NULL) { - memcpy(x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); - memcpy(x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); - x0->sel.prefixlen_d = 128; - x0->sel.prefixlen_s = 128; - memcpy(x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); - x0->km.state = XFRM_STATE_ACQ; - memcpy(x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); - x0->id.proto = proto; - x0->props.family = AF_INET6; - x0->props.mode = mode; - x0->props.reqid = reqid; - x0->lft.hard_add_expires_seconds = ACQ_EXPIRES; - atomic_inc(&x0->refcnt); - mod_timer(&x0->timer, jiffies + ACQ_EXPIRES*HZ); - atomic_inc(&x0->refcnt); - list_add_tail(&x0->bydst, xfrm_state_bydst+h); - wake_up(&km_waitq); - } - 
spin_unlock_bh(&xfrm_state_lock); - return x0; -} - -void -xfrm6_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi) -{ - u32 h; - struct xfrm_state *x0; - - if (x->id.spi) - return; - - if (minspi == maxspi) { - x0 = xfrm6_state_lookup((struct in6_addr*)x->id.daddr.a6, minspi, x->id.proto); - if (x0) { - xfrm_state_put(x0); - return; - } - x->id.spi = minspi; - } else { - u32 spi = 0; - minspi = ntohl(minspi); - maxspi = ntohl(maxspi); - for (h=0; hid.daddr.a6, htonl(spi), x->id.proto); - if (x0 == NULL) - break; - xfrm_state_put(x0); - } - x->id.spi = htonl(spi); - } - if (x->id.spi) { - spin_lock_bh(&xfrm_state_lock); - h = ntohl(x->id.daddr.a6[2]^x->id.daddr.a6[3]^x->id.spi^x->id.proto); - h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; - list_add(&x->byspi, xfrm_state_byspi+h); - atomic_inc(&x->refcnt); - spin_unlock_bh(&xfrm_state_lock); - wake_up(&km_waitq); - } -} -#endif /* CONFIG_IPV6 || CONFIG_IPV6_MODULE */ diff -ruN -x CVS linux-2.5.63/net/ipv4/xfrm_user.c linux25/net/ipv4/xfrm_user.c --- linux-2.5.63/net/ipv4/xfrm_user.c 2003-02-25 04:05:34.000000000 +0900 +++ linux25/net/ipv4/xfrm_user.c 2003-03-04 20:38:16.000000000 +0900 @@ -234,8 +234,8 @@ switch (x->props.family) { case AF_INET: - x1 = xfrm_state_lookup(x->props.saddr.xfrm4_addr, - x->id.spi, x->id.proto); + x1 = xfrm4_state_lookup(x->props.saddr.xfrm4_addr, + x->id.spi, x->id.proto); break; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) case AF_INET6: @@ -265,7 +265,7 @@ switch (p->family) { case AF_INET: - x = xfrm_state_lookup(p->saddr.xfrm4_addr, p->spi, p->proto); + x = xfrm4_state_lookup(p->saddr.xfrm4_addr, p->spi, p->proto); break; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) case AF_INET6: @@ -395,7 +395,7 @@ switch (p->family) { case AF_INET: - x = xfrm_state_lookup(p->saddr.xfrm4_addr, p->spi, p->proto); + x = xfrm4_state_lookup(p->saddr.xfrm4_addr, p->spi, p->proto); break; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) case AF_INET6: diff -ruN -x CVS 
linux-2.5.63/net/ipv6/Kconfig linux25/net/ipv6/Kconfig --- linux-2.5.63/net/ipv6/Kconfig 2003-02-25 04:05:32.000000000 +0900 +++ linux25/net/ipv6/Kconfig 2003-03-04 20:38:16.000000000 +0900 @@ -17,5 +17,18 @@ See for details. -source "net/ipv6/netfilter/Kconfig" +config INET6_AH + tristate "IPv6: AH transformation" + ---help--- + Support for IPsec AH. + + If unsure, say Y. + +config INET6_ESP + tristate "IPv6: ESP transformation" + ---help--- + Support for IPsec ESP. + If unsure, say Y. + +source "net/ipv6/netfilter/Kconfig" diff -ruN -x CVS linux-2.5.63/net/ipv6/Makefile linux25/net/ipv6/Makefile --- linux-2.5.63/net/ipv6/Makefile 2003-02-25 04:05:39.000000000 +0900 +++ linux25/net/ipv6/Makefile 2003-03-05 00:28:59.000000000 +0900 @@ -10,4 +10,6 @@ exthdrs.o sysctl_net_ipv6.o datagram.o proc.o \ ip6_flowlabel.o ipv6_syms.o +obj-$(CONFIG_INET6_AH) += ah6.o +obj-$(CONFIG_INET6_ESP) += esp6.o obj-$(CONFIG_NETFILTER) += netfilter/ diff -ruN -x CVS linux-2.5.63/net/ipv6/ah6.c linux25/net/ipv6/ah6.c --- linux-2.5.63/net/ipv6/ah6.c 1970-01-01 09:00:00.000000000 +0900 +++ linux25/net/ipv6/ah6.c 2003-03-05 11:32:51.000000000 +0900 @@ -0,0 +1,361 @@ +/* + * Copyright (C)2002 USAGI/WIDE Project + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Authors + * + * Mitsuru KANDA @USAGI : IPv6 Support + * Kazunori MIYAZAWA @USAGI : + * Kunihiro Ishiguro : + * + * This file is derived from net/ipv4/ah.c. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define AH_HLEN_NOICV 12 + +/* XXX no ipv6 ah specific */ +#define NIP6(addr) \ + ntohs((addr).s6_addr16[0]),\ + ntohs((addr).s6_addr16[1]),\ + ntohs((addr).s6_addr16[2]),\ + ntohs((addr).s6_addr16[3]),\ + ntohs((addr).s6_addr16[4]),\ + ntohs((addr).s6_addr16[5]),\ + ntohs((addr).s6_addr16[6]),\ + ntohs((addr).s6_addr16[7]) + +int ah6_output(struct sk_buff *skb) +{ + int err; + int hdr_len = sizeof(struct ipv6hdr); + struct dst_entry *dst = skb->dst; + struct xfrm_state *x = dst->xfrm; + struct ipv6hdr *iph = NULL; + struct ip_auth_hdr *ah; + struct ah_data *ahp; + u16 nh_offset = 0; + u8 nexthdr; +printk(KERN_DEBUG "%s\n", __FUNCTION__); + if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) + return -EINVAL; + + spin_lock_bh(&x->lock); + if ((err = xfrm_state_check_expire(x)) != 0) + goto error; + if ((err = xfrm_state_check_space(x, skb)) != 0) + goto error; + + if (x->props.mode) { + iph = skb->nh.ipv6h; + skb->nh.ipv6h = (struct ipv6hdr*)skb_push(skb, x->props.header_len); + skb->nh.ipv6h->version = 6; + skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + skb->nh.ipv6h->nexthdr = IPPROTO_AH; + memcpy(&skb->nh.ipv6h->saddr, &x->props.saddr, sizeof(struct in6_addr)); + memcpy(&skb->nh.ipv6h->daddr, &x->id.daddr, sizeof(struct in6_addr)); + ah = (struct ip_auth_hdr*)(skb->nh.ipv6h+1); + ah->nexthdr = IPPROTO_IPV6; + } else { + hdr_len = skb->h.raw - skb->nh.raw; + iph = kmalloc(hdr_len, GFP_ATOMIC); + if (!iph) { + err = -ENOMEM; + goto error; + } 
+ memcpy(iph, skb->data, hdr_len); + skb->nh.ipv6h = (struct ipv6hdr*)skb_push(skb, x->props.header_len); + memcpy(skb->nh.ipv6h, iph, hdr_len); + nexthdr = xfrm6_clear_mutable_options(skb, &nh_offset, XFRM_POLICY_OUT); + if (nexthdr == 0) + goto error; + + skb->nh.raw[nh_offset] = IPPROTO_AH; + skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + ah = (struct ip_auth_hdr*)(skb->nh.raw+hdr_len); + skb->h.raw = (unsigned char*) ah; + ah->nexthdr = nexthdr; + } + + skb->nh.ipv6h->priority = 0; + skb->nh.ipv6h->flow_lbl[0] = 0; + skb->nh.ipv6h->flow_lbl[1] = 0; + skb->nh.ipv6h->flow_lbl[2] = 0; + skb->nh.ipv6h->hop_limit = 0; + + ahp = x->data; + ah->hdrlen = (XFRM_ALIGN8(ahp->icv_trunc_len + + AH_HLEN_NOICV) >> 2) - 2; + + ah->reserved = 0; + ah->spi = x->id.spi; + ah->seq_no = htonl(++x->replay.oseq); + ahp->icv(ahp, skb, ah->auth_data); + + if (x->props.mode) { + skb->nh.ipv6h->hop_limit = iph->hop_limit; + skb->nh.ipv6h->priority = iph->priority; + skb->nh.ipv6h->flow_lbl[0] = iph->flow_lbl[0]; + skb->nh.ipv6h->flow_lbl[1] = iph->flow_lbl[1]; + skb->nh.ipv6h->flow_lbl[2] = iph->flow_lbl[2]; + } else { + memcpy(skb->nh.ipv6h, iph, hdr_len); + skb->nh.raw[nh_offset] = IPPROTO_AH; + skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + kfree (iph); + } + + skb->nh.raw = skb->data; + + x->curlft.bytes += skb->len; + x->curlft.packets++; + spin_unlock_bh(&x->lock); + if ((skb->dst = dst_pop(dst)) == NULL) + goto error_nolock; + return NET_XMIT_BYPASS; +error: + spin_unlock_bh(&x->lock); +error_nolock: + kfree_skb(skb); + return err; +} + +int ah6_input(struct xfrm_state *x, struct sk_buff *skb) +{ + int ah_hlen; + struct ipv6hdr *iph; + struct ipv6_auth_hdr *ah; + struct ah_data *ahp; + unsigned char *tmp_hdr = NULL; + int hdr_len = skb->h.raw - skb->nh.raw; + u8 nexthdr = 0; + + if (!pskb_may_pull(skb, sizeof(struct ip_auth_hdr))) + goto out; + + ah = (struct ipv6_auth_hdr*)skb->data; + ahp = x->data; + ah_hlen = (ah->hdrlen + 2) 
<< 2; + + if (ah_hlen != XFRM_ALIGN8(ahp->icv_full_len + AH_HLEN_NOICV) && + ah_hlen != XFRM_ALIGN8(ahp->icv_trunc_len + AH_HLEN_NOICV)) + goto out; + + if (!pskb_may_pull(skb, ah_hlen)) + goto out; + + /* We are going to _remove_ AH header to keep sockets happy, + * so... Later this can change. */ + if (skb_cloned(skb) && + pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) + goto out; + + tmp_hdr = kmalloc(hdr_len, GFP_ATOMIC); + if (!tmp_hdr) + goto out; + memcpy(tmp_hdr, skb->nh.raw, hdr_len); + ah = (struct ipv6_auth_hdr*)skb->data; + iph = skb->nh.ipv6h; + + { + u8 auth_data[ahp->icv_trunc_len]; + + memcpy(auth_data, ah->auth_data, ahp->icv_trunc_len); + skb_push(skb, skb->data - skb->nh.raw); + ahp->icv(ahp, skb, ah->auth_data); + if (memcmp(ah->auth_data, auth_data, ahp->icv_trunc_len)) { + if (net_ratelimit()) + printk(KERN_WARNING "ipsec ah authentication error\n"); + x->stats.integrity_failed++; + goto free_out; + } + } + + nexthdr = ah->nexthdr; + skb->nh.raw = skb_pull(skb, (ah->hdrlen+2)<<2); + memcpy(skb->nh.raw, tmp_hdr, hdr_len); + skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + skb_pull(skb, hdr_len); + skb->h.raw = skb->data; + + + kfree(tmp_hdr); + + return nexthdr; + +free_out: + kfree(tmp_hdr); +out: + return -EINVAL; +} + +void ah6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, + int type, int code, int offset, __u32 info) +{ + struct ipv6hdr *iph = (struct ipv6hdr*)skb->data; + struct ip_auth_hdr *ah = (struct ip_auth_hdr*)(skb->data+offset); + struct xfrm_state *x; + + if (type != ICMPV6_DEST_UNREACH && + type != ICMPV6_PKT_TOOBIG) + return; + + x = xfrm6_state_lookup(&iph->daddr, ah->spi, IPPROTO_AH); + if (!x) + return; + + printk(KERN_DEBUG "pmtu discovery on SA AH/%08x/" + "%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", + ntohl(ah->spi), NIP6(iph->daddr)); + + xfrm_state_put(x); +} + +static int ah6_init_state(struct xfrm_state *x, void *args) +{ + struct ah_data *ahp = NULL; + struct xfrm_algo_desc *aalg_desc; + + /* 
null auth can use a zero length key */ + if (x->aalg->alg_key_len > 512) + goto error; + + ahp = kmalloc(sizeof(*ahp), GFP_KERNEL); + if (ahp == NULL) + return -ENOMEM; + + memset(ahp, 0, sizeof(*ahp)); + + ahp->key = x->aalg->alg_key; + ahp->key_len = (x->aalg->alg_key_len+7)/8; + ahp->tfm = crypto_alloc_tfm(x->aalg->alg_name, 0); + if (!ahp->tfm) + goto error; + ahp->icv = ah_hmac_digest; + + /* + * Lookup the algorithm description maintained by xfrm_algo, + * verify crypto transform properties, and store information + * we need for AH processing. This lookup cannot fail here + * after a successful crypto_alloc_tfm(). + */ + aalg_desc = xfrm_aalg_get_byname(x->aalg->alg_name); + BUG_ON(!aalg_desc); + + if (aalg_desc->uinfo.auth.icv_fullbits/8 != + crypto_tfm_alg_digestsize(ahp->tfm)) { + printk(KERN_INFO "AH: %s digestsize %u != %hu\n", + x->aalg->alg_name, crypto_tfm_alg_digestsize(ahp->tfm), + aalg_desc->uinfo.auth.icv_fullbits/8); + goto error; + } + + ahp->icv_full_len = aalg_desc->uinfo.auth.icv_fullbits/8; + ahp->icv_trunc_len = aalg_desc->uinfo.auth.icv_truncbits/8; + + ahp->work_icv = kmalloc(ahp->icv_full_len, GFP_KERNEL); + if (!ahp->work_icv) + goto error; + + x->props.header_len = XFRM_ALIGN8(ahp->icv_trunc_len + AH_HLEN_NOICV); + if (x->props.mode) + x->props.header_len += 20; + x->data = ahp; + + return 0; + +error: + if (ahp) { + if (ahp->work_icv) + kfree(ahp->work_icv); + if (ahp->tfm) + crypto_free_tfm(ahp->tfm); + kfree(ahp); + } + return -EINVAL; +} + +static void ah6_destroy(struct xfrm_state *x) +{ + struct ah_data *ahp = x->data; + + if (ahp->work_icv) { + kfree(ahp->work_icv); + ahp->work_icv = NULL; + } + if (ahp->tfm) { + crypto_free_tfm(ahp->tfm); + ahp->tfm = NULL; + } +} + +static struct xfrm_type ah6_type = +{ + .description = "AH6", + .proto = IPPROTO_AH, + .init_state = ah6_init_state, + .destructor = ah6_destroy, + .input = ah6_input, + .output = ah6_output +}; + +static struct inet6_protocol ah6_protocol = { + .handler = 
xfrm6_rcv, + .err_handler = ah6_err, +}; + +int __init ah6_init(void) +{ + SET_MODULE_OWNER(&ah6_type); + + if (xfrm6_register_type(&ah6_type) < 0) { + printk(KERN_INFO "ipv6 ah init: can't add xfrm type\n"); + return -EAGAIN; + } + + if (inet6_add_protocol(&ah6_protocol, IPPROTO_AH) < 0) { + printk(KERN_INFO "ipv6 ah init: can't add protocol\n"); + xfrm6_unregister_type(&ah6_type); + return -EAGAIN; + } + + return 0; +} + +static void __exit ah6_fini(void) +{ + if (inet6_del_protocol(&ah6_protocol, IPPROTO_AH) < 0) + printk(KERN_INFO "ipv6 ah close: can't remove protocol\n"); + + if (xfrm6_unregister_type(&ah6_type) < 0) + printk(KERN_INFO "ipv6 ah close: can't remove xfrm type\n"); + +} + +module_init(ah6_init); +module_exit(ah6_fini); + +MODULE_LICENSE("GPL"); diff -ruN -x CVS linux-2.5.63/net/ipv6/esp6.c linux25/net/ipv6/esp6.c --- linux-2.5.63/net/ipv6/esp6.c 1970-01-01 09:00:00.000000000 +0900 +++ linux25/net/ipv6/esp6.c 2003-03-05 11:33:20.000000000 +0900 @@ -0,0 +1,526 @@ +/* + * Copyright (C)2002 USAGI/WIDE Project + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Authors + * + * Mitsuru KANDA @USAGI : IPv6 Support + * Kazunori MIYAZAWA @USAGI : + * Kunihiro Ishiguro : + * + * This file is derived from net/ipv4/esp.c + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MAX_SG_ONSTACK 4 + +/* BUGS: + * - we assume replay seqno is always present. + */ + +/* Move to common area: it is shared with AH. */ +/* Common with AH after some work on arguments. */ + +/* XXX no ipv6 esp specific */ +#define NIP6(addr) \ + ntohs((addr).s6_addr16[0]),\ + ntohs((addr).s6_addr16[1]),\ + ntohs((addr).s6_addr16[2]),\ + ntohs((addr).s6_addr16[3]),\ + ntohs((addr).s6_addr16[4]),\ + ntohs((addr).s6_addr16[5]),\ + ntohs((addr).s6_addr16[6]),\ + ntohs((addr).s6_addr16[7]) + +static int get_offset(u8 *packet, u32 packet_len, u8 *nexthdr, struct ipv6_opt_hdr **prevhdr) +{ + u16 offset = sizeof(struct ipv6hdr); + struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr*)(packet + offset); + u8 nextnexthdr; + + *nexthdr = ((struct ipv6hdr*)packet)->nexthdr; + + while (offset + 1 < packet_len) { + + switch (*nexthdr) { + + case NEXTHDR_HOP: + case NEXTHDR_ROUTING: + offset += ipv6_optlen(exthdr); + *nexthdr = exthdr->nexthdr; + *prevhdr = exthdr; + exthdr = (struct ipv6_opt_hdr*)(packet + offset); + break; + + case NEXTHDR_DEST: + nextnexthdr = + ((struct ipv6_opt_hdr*)(packet + offset + ipv6_optlen(exthdr)))->nexthdr; + /* XXX We know the option is inner dest opt + with next next header check. 
*/ + if (nextnexthdr != NEXTHDR_HOP && + nextnexthdr != NEXTHDR_ROUTING && + nextnexthdr != NEXTHDR_DEST) { + return offset; + } + offset += ipv6_optlen(exthdr); + *nexthdr = exthdr->nexthdr; + *prevhdr = exthdr; + exthdr = (struct ipv6_opt_hdr*)(packet + offset); + break; + + default : + return offset; + } + } + + return offset; +} + +int esp6_output(struct sk_buff *skb) +{ + int err; + int hdr_len = 0; + struct dst_entry *dst = skb->dst; + struct xfrm_state *x = dst->xfrm; + struct ipv6hdr *iph = NULL, *top_iph; + struct ip_esp_hdr *esph; + struct crypto_tfm *tfm; + struct esp_data *esp; + struct sk_buff *trailer; + struct ipv6_opt_hdr *prevhdr = NULL; + int blksize; + int clen; + int alen; + int nfrags; + u8 nexthdr; +printk(KERN_DEBUG "%s\n", __FUNCTION__); + /* First, if the skb is not checksummed, complete checksum. */ + if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) + return -EINVAL; + + spin_lock_bh(&x->lock); + if ((err = xfrm_state_check_expire(x)) != 0) + goto error; + if ((err = xfrm_state_check_space(x, skb)) != 0) + goto error; + + err = -ENOMEM; + + /* Strip IP header in transport mode. Save it. */ + + if (!x->props.mode) { + hdr_len = get_offset(skb->nh.raw, skb->len, &nexthdr, &prevhdr); + iph = kmalloc(hdr_len, GFP_ATOMIC); + if (!iph) { + err = -ENOMEM; + goto error; + } + memcpy(iph, skb->nh.raw, hdr_len); + __skb_pull(skb, hdr_len); + } + + /* Now skb is pure payload to encrypt */ + + /* Round to block size */ + clen = skb->len; + + esp = x->data; + alen = esp->auth.icv_trunc_len; + tfm = esp->conf.tfm; + blksize = crypto_tfm_alg_blocksize(tfm); + clen = (clen + 2 + blksize-1)&~(blksize-1); + if (esp->conf.padlen) + clen = (clen + esp->conf.padlen-1)&~(esp->conf.padlen-1); + + if ((nfrags = skb_cow_data(skb, clen-skb->len+alen, &trailer)) < 0) { + if (!x->props.mode && iph) kfree(iph); + goto error; + } + + /* Fill padding... 
*/ + do { + int i; + for (i=0; i<clen-skb->len - 2; i++) + *(u8*)(trailer->tail + i) = i+1; + } while (0); + *(u8*)(trailer->tail + clen-skb->len - 2) = (clen - skb->len)-2; + pskb_put(skb, trailer, clen - skb->len); + + if (x->props.mode) { + iph = skb->nh.ipv6h; + top_iph = (struct ipv6hdr*)skb_push(skb, x->props.header_len); + esph = (struct ip_esp_hdr*)(top_iph+1); + *(u8*)(trailer->tail - 1) = IPPROTO_IPV6; + top_iph->version = 6; + top_iph->priority = iph->priority; + top_iph->flow_lbl[0] = iph->flow_lbl[0]; + top_iph->flow_lbl[1] = iph->flow_lbl[1]; + top_iph->flow_lbl[2] = iph->flow_lbl[2]; + top_iph->nexthdr = IPPROTO_ESP; + top_iph->payload_len = htons(skb->len + alen); + top_iph->hop_limit = iph->hop_limit; + memcpy(&top_iph->saddr, (struct in6_addr *)&x->props.saddr, sizeof(struct in6_addr)); + memcpy(&top_iph->daddr, (struct in6_addr *)&x->id.daddr, sizeof(struct in6_addr)); + } else { + /* XXX exthdr */ + esph = (struct ip_esp_hdr*)skb_push(skb, x->props.header_len); + skb->h.raw = (unsigned char*)esph; + top_iph = (struct ipv6hdr*)skb_push(skb, hdr_len); + memcpy(top_iph, iph, hdr_len); + kfree(iph); + top_iph->payload_len = htons(skb->len + alen - sizeof(struct ipv6hdr)); + if (prevhdr) { + prevhdr->nexthdr = IPPROTO_ESP; + } else { + top_iph->nexthdr = IPPROTO_ESP; + } + *(u8*)(trailer->tail - 1) = nexthdr; + } + + esph->spi = x->id.spi; + esph->seq_no = htonl(++x->replay.oseq); + + if (esp->conf.ivlen) + crypto_cipher_set_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + + do { + struct scatterlist sgbuf[nfrags>MAX_SG_ONSTACK ? 
0 : nfrags]; + struct scatterlist *sg = sgbuf; + + if (unlikely(nfrags > MAX_SG_ONSTACK)) { + sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC); + if (!sg) + goto error; + } + skb_to_sgvec(skb, sg, esph->enc_data+esp->conf.ivlen-skb->data, clen); + crypto_cipher_encrypt(tfm, sg, sg, clen); + if (unlikely(sg != sgbuf)) + kfree(sg); + } while (0); + + if (esp->conf.ivlen) { + memcpy(esph->enc_data, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + crypto_cipher_get_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + } + + if (esp->auth.icv_full_len) { + esp->auth.icv(esp, skb, (u8*)esph-skb->data, + 8+esp->conf.ivlen+clen, trailer->tail); + pskb_put(skb, trailer, alen); + } + + skb->nh.raw = skb->data; + + x->curlft.bytes += skb->len; + x->curlft.packets++; + spin_unlock_bh(&x->lock); + if ((skb->dst = dst_pop(dst)) == NULL) + goto error_nolock; + return NET_XMIT_BYPASS; + +error: + spin_unlock_bh(&x->lock); +error_nolock: + kfree_skb(skb); + return err; +} + +int esp6_input(struct xfrm_state *x, struct sk_buff *skb) +{ + struct ipv6hdr *iph; + struct ip_esp_hdr *esph; + struct esp_data *esp = x->data; + struct sk_buff *trailer; + int blksize = crypto_tfm_alg_blocksize(esp->conf.tfm); + int alen = esp->auth.icv_trunc_len; + int elen = skb->len - 8 - esp->conf.ivlen - alen; + + int hdr_len = skb->h.raw - skb->nh.raw; + int nfrags; + u8 ret_nexthdr = 0; + unsigned char *tmp_hdr = NULL; + + if (!pskb_may_pull(skb, sizeof(struct ip_esp_hdr))) + goto out; + + if (elen <= 0 || (elen & (blksize-1))) + goto out; + + tmp_hdr = kmalloc(hdr_len, GFP_ATOMIC); + if (!tmp_hdr) + goto out; + memcpy(tmp_hdr, skb->nh.raw, hdr_len); + + /* If integrity check is required, do this. 
*/ + if (esp->auth.icv_full_len) { + u8 sum[esp->auth.icv_full_len]; + u8 sum1[alen]; + + esp->auth.icv(esp, skb, 0, skb->len-alen, sum); + + if (skb_copy_bits(skb, skb->len-alen, sum1, alen)) + BUG(); + + if (unlikely(memcmp(sum, sum1, alen))) { + x->stats.integrity_failed++; + goto out; + } + } + + if ((nfrags = skb_cow_data(skb, 0, &trailer)) < 0) + goto out; + + skb->ip_summed = CHECKSUM_NONE; + + esph = (struct ip_esp_hdr*)skb->data; + iph = skb->nh.ipv6h; + + /* Get ivec. This can be wrong, check against another impls. */ + if (esp->conf.ivlen) + crypto_cipher_set_iv(esp->conf.tfm, esph->enc_data, crypto_tfm_alg_ivsize(esp->conf.tfm)); + + { + u8 nexthdr[2]; + struct scatterlist sgbuf[nfrags>MAX_SG_ONSTACK ? 0 : nfrags]; + struct scatterlist *sg = sgbuf; + u8 padlen; + + if (unlikely(nfrags > MAX_SG_ONSTACK)) { + sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC); + if (!sg) + goto out; + } + skb_to_sgvec(skb, sg, 8+esp->conf.ivlen, elen); + crypto_cipher_decrypt(esp->conf.tfm, sg, sg, elen); + if (unlikely(sg != sgbuf)) + kfree(sg); + + if (skb_copy_bits(skb, skb->len-alen-2, nexthdr, 2)) + BUG(); + + padlen = nexthdr[0]; + if (padlen+2 >= elen) { + if (net_ratelimit()) { + printk(KERN_WARNING "ipsec esp packet is garbage padlen=%d, elen=%d\n", padlen+2, elen); + } + goto out; + } + /* ... check padding bits here. Silly. :-) */ + + ret_nexthdr = nexthdr[1]; + pskb_trim(skb, skb->len - alen - padlen - 2); + skb->h.raw = skb_pull(skb, 8 + esp->conf.ivlen); + skb->nh.raw += 8 + esp->conf.ivlen; + memcpy(skb->nh.raw, tmp_hdr, hdr_len); + } + kfree(tmp_hdr); + return ret_nexthdr; + +out: + return -EINVAL; +} + +static u32 esp6_get_max_size(struct xfrm_state *x, int mtu) +{ + struct esp_data *esp = x->data; + u32 blksize = crypto_tfm_alg_blocksize(esp->conf.tfm); + + if (x->props.mode) { + mtu = (mtu + 2 + blksize-1)&~(blksize-1); + } else { + /* The worst case. 
*/ + mtu += 2 + blksize; + } + if (esp->conf.padlen) + mtu = (mtu + esp->conf.padlen-1)&~(esp->conf.padlen-1); + + return mtu + x->props.header_len + esp->auth.icv_full_len; +} + +void esp6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, + int type, int code, int offset, __u32 info) +{ + struct ipv6hdr *iph = (struct ipv6hdr*)skb->data; + struct ip_esp_hdr *esph = (struct ip_esp_hdr*)(skb->data+offset); + struct xfrm_state *x; + + if (type != ICMPV6_DEST_UNREACH && + type != ICMPV6_PKT_TOOBIG) + return; + + x = xfrm6_state_lookup(&iph->daddr, esph->spi, IPPROTO_ESP); + if (!x) + return; + printk(KERN_DEBUG "pmtu discovery on SA ESP/%08x/" + "%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", + ntohl(esph->spi), NIP6(iph->daddr)); + xfrm_state_put(x); +} + +void esp6_destroy(struct xfrm_state *x) +{ + struct esp_data *esp = x->data; + + if (esp->conf.tfm) { + crypto_free_tfm(esp->conf.tfm); + esp->conf.tfm = NULL; + } + if (esp->conf.ivec) { + kfree(esp->conf.ivec); + esp->conf.ivec = NULL; + } + if (esp->auth.tfm) { + crypto_free_tfm(esp->auth.tfm); + esp->auth.tfm = NULL; + } + if (esp->auth.work_icv) { + kfree(esp->auth.work_icv); + esp->auth.work_icv = NULL; + } +} + +int esp6_init_state(struct xfrm_state *x, void *args) +{ + struct esp_data *esp = NULL; + + if (x->aalg) { + if (x->aalg->alg_key_len == 0 || x->aalg->alg_key_len > 512) + goto error; + } + if (x->ealg == NULL || x->ealg->alg_key_len == 0) + goto error; + + esp = kmalloc(sizeof(*esp), GFP_KERNEL); + if (esp == NULL) + return -ENOMEM; + + memset(esp, 0, sizeof(*esp)); + + if (x->aalg) { + struct xfrm_algo_desc *aalg_desc; + + esp->auth.key = x->aalg->alg_key; + esp->auth.key_len = (x->aalg->alg_key_len+7)/8; + esp->auth.tfm = crypto_alloc_tfm(x->aalg->alg_name, 0); + if (esp->auth.tfm == NULL) + goto error; + esp->auth.icv = esp_hmac_digest; + + aalg_desc = xfrm_aalg_get_byname(x->aalg->alg_name); + BUG_ON(!aalg_desc); + + if (aalg_desc->uinfo.auth.icv_fullbits/8 != + 
crypto_tfm_alg_digestsize(esp->auth.tfm)) { + printk(KERN_INFO "ESP: %s digestsize %u != %hu\n", + x->aalg->alg_name, + crypto_tfm_alg_digestsize(esp->auth.tfm), + aalg_desc->uinfo.auth.icv_fullbits/8); + goto error; + } + + esp->auth.icv_full_len = aalg_desc->uinfo.auth.icv_fullbits/8; + esp->auth.icv_trunc_len = aalg_desc->uinfo.auth.icv_truncbits/8; + + esp->auth.work_icv = kmalloc(esp->auth.icv_full_len, GFP_KERNEL); + if (!esp->auth.work_icv) + goto error; + } + esp->conf.key = x->ealg->alg_key; + esp->conf.key_len = (x->ealg->alg_key_len+7)/8; + esp->conf.tfm = crypto_alloc_tfm(x->ealg->alg_name, CRYPTO_TFM_MODE_CBC); + if (esp->conf.tfm == NULL) + goto error; + esp->conf.ivlen = crypto_tfm_alg_ivsize(esp->conf.tfm); + esp->conf.padlen = 0; + if (esp->conf.ivlen) { + esp->conf.ivec = kmalloc(esp->conf.ivlen, GFP_KERNEL); + get_random_bytes(esp->conf.ivec, esp->conf.ivlen); + } + crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len); + x->props.header_len = 8 + esp->conf.ivlen; + if (x->props.mode) + x->props.header_len += 40; /* XXX ext hdr */ + x->data = esp; + return 0; + +error: + if (esp) { + if (esp->auth.tfm) + crypto_free_tfm(esp->auth.tfm); + if (esp->auth.work_icv) + kfree(esp->auth.work_icv); + if (esp->conf.tfm) + crypto_free_tfm(esp->conf.tfm); + kfree(esp); + } + return -EINVAL; +} + +static struct xfrm_type esp6_type = +{ + .description = "ESP6", + .proto = IPPROTO_ESP, + .init_state = esp6_init_state, + .destructor = esp6_destroy, + .get_max_size = esp6_get_max_size, + .input = esp6_input, + .output = esp6_output +}; + +static struct inet6_protocol esp6_protocol = { + .handler = xfrm6_rcv, + .err_handler = esp6_err, +}; + +int __init esp6_init(void) +{ + SET_MODULE_OWNER(&esp6_type); + if (xfrm6_register_type(&esp6_type) < 0) { + printk(KERN_INFO "ipv6 esp init: can't add xfrm type\n"); + return -EAGAIN; + } + if (inet6_add_protocol(&esp6_protocol, IPPROTO_ESP) < 0) { + printk(KERN_INFO "ipv6 esp init: can't add protocol\n"); + 
xfrm6_unregister_type(&esp6_type); + return -EAGAIN; + } + + return 0; +} + +static void __exit esp6_fini(void) +{ + if (inet6_del_protocol(&esp6_protocol, IPPROTO_ESP) < 0) + printk(KERN_INFO "ipv6 esp close: can't remove protocol\n"); + if (xfrm6_unregister_type(&esp6_type) < 0) + printk(KERN_INFO "ipv6 esp close: can't remove xfrm type\n"); +} + +module_init(esp6_init); +module_exit(esp6_fini); + +MODULE_LICENSE("GPL"); diff -ruN -x CVS linux-2.5.63/net/ipv6/ip6_input.c linux25/net/ipv6/ip6_input.c --- linux-2.5.63/net/ipv6/ip6_input.c 2003-02-25 04:05:39.000000000 +0900 +++ linux25/net/ipv6/ip6_input.c 2003-03-04 20:38:16.000000000 +0900 @@ -150,7 +150,8 @@ It would be stupid to detect for optional headers, which are missing with probability of 200% */ - if (nexthdr != IPPROTO_TCP && nexthdr != IPPROTO_UDP) { + if (nexthdr != IPPROTO_TCP && nexthdr != IPPROTO_UDP && + nexthdr != NEXTHDR_AUTH && nexthdr != NEXTHDR_ESP) { nhoff = ipv6_parse_exthdrs(&skb, nhoff); if (nhoff < 0) return 0; diff -ruN -x CVS linux-2.5.63/net/ipv6/ip6_output.c linux25/net/ipv6/ip6_output.c --- linux-2.5.63/net/ipv6/ip6_output.c 2003-02-25 04:05:06.000000000 +0900 +++ linux25/net/ipv6/ip6_output.c 2003-03-04 20:38:16.000000000 +0900 @@ -192,6 +192,11 @@ int seg_len = skb->len; int hlimit; u32 mtu; + int err = 0; + + if ((err = xfrm_lookup(&skb->dst, fl, sk, 0)) < 0) { + return err; + } if (opt) { int head_room; @@ -576,6 +581,13 @@ } pktlength = length; + if (dst) { + if ((err = xfrm_lookup(&dst, fl, sk, 0)) < 0) { + dst_release(dst); + return -ENETUNREACH; + } + } + if (hlimit < 0) { if (ipv6_addr_is_multicast(fl->fl6_dst)) hlimit = np->mcast_hops; @@ -630,10 +642,8 @@ err = 0; if (flags&MSG_PROBE) goto out; - - skb = sock_alloc_send_skb(sk, pktlength + 15 + - dev->hard_header_len, - flags & MSG_DONTWAIT, &err); + /* alloc skb with mtu as we do in the IPv4 stack for IPsec */ + skb = sock_alloc_send_skb(sk, mtu, flags & MSG_DONTWAIT, &err); if (skb == NULL) { 
IP6_INC_STATS(Ip6OutDiscards); @@ -663,6 +673,8 @@ err = getfrag(data, &hdr->saddr, ((char *) hdr) + (pktlength - length), 0, length); + if (!opt || !opt->dst1opt) + skb->h.raw = ((char *) hdr) + (pktlength - length); if (!err) { IP6_INC_STATS(Ip6OutRequests); diff -ruN -x CVS linux-2.5.63/net/ipv6/ndisc.c linux25/net/ipv6/ndisc.c --- linux-2.5.63/net/ipv6/ndisc.c 2003-02-25 04:05:34.000000000 +0900 +++ linux25/net/ipv6/ndisc.c 2003-03-05 11:30:41.000000000 +0900 @@ -71,6 +71,7 @@ #include #include +#include #include #include @@ -335,8 +336,6 @@ unsigned char ha[MAX_ADDR_LEN]; unsigned char *h_dest = NULL; - skb_reserve(skb, (dev->hard_header_len + 15) & ~15); - if (dev->hard_header) { if (ipv6_addr_type(daddr) & IPV6_ADDR_MULTICAST) { ndisc_mc_map(daddr, ha, dev, 1); @@ -373,10 +372,50 @@ * Send a Neighbour Advertisement */ +int ndisc_output(struct sk_buff *skb) +{ + if (skb) { + struct neighbour *neigh = (skb->dst ? skb->dst->neighbour : NULL); + if (ndisc_build_ll_hdr(skb, skb->dev, &skb->nh.ipv6h->daddr, neigh, skb->len) == 0) { + kfree_skb(skb); + return -EINVAL; + } + dev_queue_xmit(skb); + return 0; + } + return -EINVAL; +} + +static inline void ndisc_rt_init(struct rt6_info *rt, struct net_device *dev, + struct neighbour *neigh) +{ + rt->rt6i_dev = dev; + rt->rt6i_nexthop = neigh; + rt->rt6i_expires = 0; + rt->rt6i_flags = RTF_LOCAL; + rt->rt6i_metric = 0; + rt->rt6i_hoplimit = 255; + rt->u.dst.output = ndisc_output; +} + +static inline void ndisc_flow_init(struct flowi *fl, u8 type, + struct in6_addr *saddr, struct in6_addr *daddr) +{ + memset(fl, 0, sizeof(*fl)); + fl->fl6_src = saddr; + fl->fl6_dst = daddr; + fl->proto = IPPROTO_ICMPV6; + fl->uli_u.icmpt.type = type; + fl->uli_u.icmpt.code = 0; +} + static void ndisc_send_na(struct net_device *dev, struct neighbour *neigh, struct in6_addr *daddr, struct in6_addr *solicited_addr, - int router, int solicited, int override, int inc_opt) + int router, int solicited, int override, int inc_opt) { + struct 
flowi fl; + struct rt6_info *rt = NULL; + struct dst_entry* dst; struct sock *sk = ndisc_socket->sk; struct nd_msg *msg; int len; @@ -385,6 +424,22 @@ len = sizeof(struct icmp6hdr) + sizeof(struct in6_addr); + rt = ndisc_get_dummy_rt(); + if (!rt) + return; + + ndisc_flow_init(&fl, NDISC_NEIGHBOUR_ADVERTISEMENT, solicited_addr, daddr); + ndisc_rt_init(rt, dev, neigh); + + dst = (struct dst_entry*)rt; + dst_clone(dst); + + err = xfrm_lookup(&dst, &fl, NULL, 0); + if (err < 0) { + dst_release(dst); + return; + } + if (inc_opt) { if (dev->addr_len) len += NDISC_OPT_SPACE(dev->addr_len); @@ -400,14 +455,10 @@ return; } - if (ndisc_build_ll_hdr(skb, dev, daddr, neigh, len) == 0) { - kfree_skb(skb); - return; - } - + skb_reserve(skb, (dev->hard_header_len + 15) & ~15); ip6_nd_hdr(sk, skb, dev, solicited_addr, daddr, IPPROTO_ICMPV6, len); - msg = (struct nd_msg *) skb_put(skb, len); + skb->h.raw = (unsigned char*) msg = (struct nd_msg *) skb_put(skb, len); msg->icmph.icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT; msg->icmph.icmp6_code = 0; @@ -430,7 +481,9 @@ csum_partial((__u8 *) msg, len, 0)); - dev_queue_xmit(skb); + dst_clone(dst); + skb->dst = dst; + dst_output(skb); ICMP6_INC_STATS(Icmp6OutNeighborAdvertisements); ICMP6_INC_STATS(Icmp6OutMsgs); @@ -440,6 +493,9 @@ struct in6_addr *solicit, struct in6_addr *daddr, struct in6_addr *saddr) { + struct flowi fl; + struct rt6_info *rt = NULL; + struct dst_entry* dst; struct sock *sk = ndisc_socket->sk; struct sk_buff *skb; struct nd_msg *msg; @@ -454,6 +510,22 @@ saddr = &addr_buf; } + rt = ndisc_get_dummy_rt(); + if (!rt) + return; + + ndisc_flow_init(&fl, NDISC_NEIGHBOUR_SOLICITATION, saddr, daddr); + ndisc_rt_init(rt, dev, neigh); + + dst = (struct dst_entry*)rt; + dst_clone(dst); + + err = xfrm_lookup(&dst, &fl, NULL, 0); + if (err < 0) { + dst_release(dst); + return; + } + len = sizeof(struct icmp6hdr) + sizeof(struct in6_addr); send_llinfo = dev->addr_len && ipv6_addr_type(saddr) != IPV6_ADDR_ANY; if (send_llinfo) @@ 
-466,14 +538,10 @@ return; } - if (ndisc_build_ll_hdr(skb, dev, daddr, neigh, len) == 0) { - kfree_skb(skb); - return; - } - + skb_reserve(skb, (dev->hard_header_len + 15) & ~15); ip6_nd_hdr(sk, skb, dev, saddr, daddr, IPPROTO_ICMPV6, len); - msg = (struct nd_msg *)skb_put(skb, len); + skb->h.raw = (unsigned char*) msg = (struct nd_msg *)skb_put(skb, len); msg->icmph.icmp6_type = NDISC_NEIGHBOUR_SOLICITATION; msg->icmph.icmp6_code = 0; msg->icmph.icmp6_cksum = 0; @@ -492,7 +560,9 @@ csum_partial((__u8 *) msg, len, 0)); /* send it! */ - dev_queue_xmit(skb); + dst_clone(dst); + skb->dst = dst; + dst_output(skb); ICMP6_INC_STATS(Icmp6OutNeighborSolicits); ICMP6_INC_STATS(Icmp6OutMsgs); @@ -501,6 +571,9 @@ void ndisc_send_rs(struct net_device *dev, struct in6_addr *saddr, struct in6_addr *daddr) { + struct flowi fl; + struct rt6_info *rt = NULL; + struct dst_entry* dst; struct sock *sk = ndisc_socket->sk; struct sk_buff *skb; struct icmp6hdr *hdr; @@ -508,6 +581,22 @@ int len; int err; + rt = ndisc_get_dummy_rt(); + if (!rt) + return; + + ndisc_flow_init(&fl, NDISC_ROUTER_SOLICITATION, saddr, daddr); + ndisc_rt_init(rt, dev, NULL); + + dst = (struct dst_entry*)rt; + dst_clone(dst); + + err = xfrm_lookup(&dst, &fl, NULL, 0); + if (err < 0) { + dst_release(dst); + return; + } + len = sizeof(struct icmp6hdr); if (dev->addr_len) len += NDISC_OPT_SPACE(dev->addr_len); @@ -519,14 +608,10 @@ return; } - if (ndisc_build_ll_hdr(skb, dev, daddr, NULL, len) == 0) { - kfree_skb(skb); - return; - } - + skb_reserve(skb, (dev->hard_header_len + 15) & ~15); ip6_nd_hdr(sk, skb, dev, saddr, daddr, IPPROTO_ICMPV6, len); - hdr = (struct icmp6hdr *) skb_put(skb, len); + skb->h.raw = (unsigned char*) hdr = (struct icmp6hdr *) skb_put(skb, len); hdr->icmp6_type = NDISC_ROUTER_SOLICITATION; hdr->icmp6_code = 0; hdr->icmp6_cksum = 0; @@ -543,7 +628,9 @@ csum_partial((__u8 *) hdr, len, 0)); /* send it! 
*/ - dev_queue_xmit(skb); + dst_clone(dst); + skb->dst = dst; + dst_output(skb); ICMP6_INC_STATS(Icmp6OutRouterSolicits); ICMP6_INC_STATS(Icmp6OutMsgs); @@ -1125,6 +1212,8 @@ struct in6_addr *addrp; struct net_device *dev; struct rt6_info *rt; + struct dst_entry *dst; + struct flowi fl; u8 *opt; int rd_len; int err; @@ -1136,6 +1225,22 @@ if (rt == NULL) return; + dst = (struct dst_entry*)rt; + + if (ipv6_get_lladdr(dev, &saddr_buf)) { + ND_PRINTK1("redirect: no link_local addr for dev\n"); + return; + } + + ndisc_flow_init(&fl, NDISC_REDIRECT, &saddr_buf, &skb->nh.ipv6h->saddr); + + dst_clone(dst); + err = xfrm_lookup(&dst, &fl, NULL, 0); + if (err) { + dst_release(dst); + return; + } + if (rt->rt6i_flags & RTF_GATEWAY) { ND_PRINTK1("ndisc_send_redirect: not a neighbour\n"); dst_release(&rt->u.dst); @@ -1164,11 +1269,6 @@ rd_len &= ~0x7; len += rd_len; - if (ipv6_get_lladdr(dev, &saddr_buf)) { - ND_PRINTK1("redirect: no link_local addr for dev\n"); - return; - } - buff = sock_alloc_send_skb(sk, MAX_HEADER + len + dev->hard_header_len + 15, 0, &err); if (buff == NULL) { @@ -1178,15 +1278,11 @@ hlen = 0; - if (ndisc_build_ll_hdr(buff, dev, &skb->nh.ipv6h->saddr, NULL, len) == 0) { - kfree_skb(buff); - return; - } - + skb_reserve(skb, (dev->hard_header_len + 15) & ~15); ip6_nd_hdr(sk, buff, dev, &saddr_buf, &skb->nh.ipv6h->saddr, IPPROTO_ICMPV6, len); - icmph = (struct icmp6hdr *) skb_put(buff, len); + skb->h.raw = (unsigned char*) icmph = (struct icmp6hdr *) skb_put(buff, len); memset(icmph, 0, sizeof(struct icmp6hdr)); icmph->icmp6_type = NDISC_REDIRECT; @@ -1224,7 +1320,8 @@ len, IPPROTO_ICMPV6, csum_partial((u8 *) icmph, len, 0)); - dev_queue_xmit(buff); + skb->dst = dst; + dst_output(skb); ICMP6_INC_STATS(Icmp6OutRedirects); ICMP6_INC_STATS(Icmp6OutMsgs); diff -ruN -x CVS linux-2.5.63/net/ipv6/raw.c linux25/net/ipv6/raw.c --- linux-2.5.63/net/ipv6/raw.c 2003-02-25 04:05:16.000000000 +0900 +++ linux25/net/ipv6/raw.c 2003-03-04 20:38:16.000000000 +0900 @@ -45,6 
+45,7 @@ #include #include +#include struct sock *raw_v6_htable[RAWV6_HTABLE_SIZE]; rwlock_t raw_v6_lock = RW_LOCK_UNLOCKED; @@ -304,6 +305,11 @@ struct inet_opt *inet = inet_sk(sk); struct raw6_opt *raw_opt = raw6_sk(sk); + if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) { + kfree_skb(skb); + return NET_RX_DROP; + } + if (!raw_opt->checksum) skb->ip_summed = CHECKSUM_UNNECESSARY; diff -ruN -x CVS linux-2.5.63/net/ipv6/route.c linux25/net/ipv6/route.c --- linux-2.5.63/net/ipv6/route.c 2003-02-25 04:05:39.000000000 +0900 +++ linux25/net/ipv6/route.c 2003-03-05 11:30:41.000000000 +0900 @@ -49,6 +49,8 @@ #include #include #include +#include +#include #include @@ -128,6 +130,12 @@ rwlock_t rt6_lock = RW_LOCK_UNLOCKED; +/* Dummy rt for ndisc */ +struct rt6_info *ndisc_get_dummy_rt() +{ + return dst_alloc(&ip6_dst_ops); +} + /* * Route lookup. Any rt6_lock is implied. */ @@ -1809,6 +1817,14 @@ #endif +int xfrm6_dst_lookup(struct xfrm_dst **dst, struct flowi *fl) +{ + int err = 0; + *dst = (struct xfrm_dst*)ip6_route_output(NULL, fl); + if (!*dst) + err = -ENETUNREACH; + return err; +} void __init ip6_route_init(void) { @@ -1817,6 +1833,7 @@ 0, SLAB_HWCACHE_ALIGN, NULL, NULL); fib6_init(); + xfrm_dst_lookup_register(xfrm6_dst_lookup, AF_INET6); #ifdef CONFIG_PROC_FS proc_net_create("ipv6_route", 0, rt6_proc_info); proc_net_create("rt6_stats", 0, rt6_proc_stats); @@ -1830,7 +1847,7 @@ proc_net_remove("ipv6_route"); proc_net_remove("rt6_stats"); #endif - + xfrm_dst_lookup_unregister(AF_INET6); rt6_ifdown(NULL); fib6_gc_cleanup(); } diff -ruN -x CVS linux-2.5.63/net/ipv6/tcp_ipv6.c linux25/net/ipv6/tcp_ipv6.c --- linux-2.5.63/net/ipv6/tcp_ipv6.c 2003-02-25 04:05:33.000000000 +0900 +++ linux25/net/ipv6/tcp_ipv6.c 2003-03-04 20:38:16.000000000 +0900 @@ -50,6 +50,7 @@ #include #include #include +#include #include @@ -677,6 +678,9 @@ fl.nl_u.ip6_u.daddr = rt0->addr; } + if (!fl.fl6_src) + fl.fl6_src = &np->saddr; + dst = ip6_route_output(sk, &fl); if ((err = dst->error) != 0) 
{ @@ -1637,6 +1641,9 @@ if (sk_filter(sk, skb, 0)) goto discard_and_relse; + if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) + goto discard_it; + skb->dev = NULL; bh_lock_sock(sk); @@ -1652,6 +1659,9 @@ return ret; no_tcp_socket: + if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) + goto discard_and_relse; + if (skb->len < (th->doff<<2) || tcp_checksum_complete(skb)) { bad_packet: TCP_INC_STATS_BH(TcpInErrs); @@ -1671,8 +1681,11 @@ discard_and_relse: sock_put(sk); goto discard_it; - + do_time_wait: + if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) + goto discard_and_relse; + if (skb->len < (th->doff<<2) || tcp_checksum_complete(skb)) { TCP_INC_STATS_BH(TcpInErrs); sock_put(sk); diff -ruN -x CVS linux-2.5.63/net/ipv6/udp.c linux25/net/ipv6/udp.c --- linux-2.5.63/net/ipv6/udp.c 2003-02-25 04:05:40.000000000 +0900 +++ linux25/net/ipv6/udp.c 2003-03-04 20:38:16.000000000 +0900 @@ -50,6 +50,7 @@ #include #include +#include DEFINE_SNMP_STAT(struct udp_mib, udp_stats_in6); @@ -541,6 +542,11 @@ static inline int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb) { + if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) { + kfree_skb(skb); + return -1; + } + #if defined(CONFIG_FILTER) if (sk->filter && skb->ip_summed != CHECKSUM_UNNECESSARY) { if ((unsigned short)csum_fold(skb_checksum(skb, 0, skb->len, skb->csum))) { @@ -646,6 +652,9 @@ if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto short_packet; + if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) + goto discard; + saddr = &skb->nh.ipv6h->saddr; daddr = &skb->nh.ipv6h->daddr; uh = skb->h.uh; diff -ruN -x CVS linux-2.5.63/net/key/af_key.c linux25/net/key/af_key.c --- linux-2.5.63/net/key/af_key.c 2003-02-25 04:05:13.000000000 +0900 +++ linux25/net/key/af_key.c 2003-03-04 20:38:16.000000000 +0900 @@ -550,8 +550,8 @@ switch (((struct sockaddr *)(addr + 1))->sa_family) { case AF_INET: - x = xfrm_state_lookup(((struct sockaddr_in *)(addr + 1))->sin_addr.s_addr, - sa->sadb_sa_spi, proto); + x = 
xfrm4_state_lookup(((struct sockaddr_in *)(addr + 1))->sin_addr.s_addr, + sa->sadb_sa_spi, proto); break; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) case AF_INET6: @@ -1097,18 +1097,7 @@ min_spi = htonl(0x100); max_spi = htonl(0x0fffffff); } - switch (x->props.family) { - case AF_INET: - xfrm_alloc_spi(x, min_spi, max_spi); - break; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - case AF_INET6: - xfrm6_alloc_spi(x, min_spi, max_spi); - break; -#endif - default: - break; - } + xfrm_alloc_spi(x, min_spi, max_spi); if (x->id.spi) resp_skb = pfkey_xfrm_state2msg(x, 0, 3); } diff -ruN -x CVS linux-2.5.63/net/netsyms.c linux25/net/netsyms.c --- linux-2.5.63/net/netsyms.c 2003-02-25 04:05:16.000000000 +0900 +++ linux25/net/netsyms.c 2003-03-04 20:38:15.000000000 +0900 @@ -296,11 +296,11 @@ EXPORT_SYMBOL(__xfrm_route_forward); EXPORT_SYMBOL(xfrm_state_alloc); EXPORT_SYMBOL(__xfrm_state_destroy); -EXPORT_SYMBOL(xfrm_state_find); +EXPORT_SYMBOL(xfrm4_state_find); EXPORT_SYMBOL(xfrm_state_insert); EXPORT_SYMBOL(xfrm_state_check_expire); EXPORT_SYMBOL(xfrm_state_check_space); -EXPORT_SYMBOL(xfrm_state_lookup); +EXPORT_SYMBOL(xfrm4_state_lookup); EXPORT_SYMBOL(xfrm_replay_check); EXPORT_SYMBOL(xfrm_replay_advance); EXPORT_SYMBOL(xfrm_check_selectors); @@ -324,13 +324,17 @@ EXPORT_SYMBOL(xfrm_policy_flush); EXPORT_SYMBOL(xfrm_policy_byid); EXPORT_SYMBOL(xfrm_policy_list); +EXPORT_SYMBOL(xfrm_dst_lookup_register); +EXPORT_SYMBOL(xfrm_dst_lookup_unregister); #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +EXPORT_SYMBOL(xfrm6_state_find); +EXPORT_SYMBOL(xfrm6_rcv); EXPORT_SYMBOL(xfrm6_state_lookup); EXPORT_SYMBOL(xfrm6_find_acq); -EXPORT_SYMBOL(xfrm6_alloc_spi); EXPORT_SYMBOL(xfrm6_register_type); EXPORT_SYMBOL(xfrm6_unregister_type); EXPORT_SYMBOL(xfrm6_get_type); +EXPORT_SYMBOL(xfrm6_clear_mutable_options); #endif EXPORT_SYMBOL_GPL(xfrm_probe_algs); @@ -342,6 +346,15 @@ EXPORT_SYMBOL_GPL(xfrm_ealg_get_byid); 
EXPORT_SYMBOL_GPL(xfrm_aalg_get_byname); EXPORT_SYMBOL_GPL(xfrm_ealg_get_byname); +#if defined(CONFIG_INET_AH) || defined(CONFIG_INET_AH_MODULE) || defined(CONFIG_INET6_AH) || defined(CONFIG_INET6_AH_MODULE) +EXPORT_SYMBOL_GPL(skb_ah_walk); +#endif +#if defined(CONFIG_INET_ESP) || defined(CONFIG_INET_ESP_MODULE) || defined(CONFIG_INET6_ESP) || defined(CONFIG_INET6_ESP_MODULE) +EXPORT_SYMBOL_GPL(skb_cow_data); +EXPORT_SYMBOL_GPL(pskb_put); +EXPORT_SYMBOL_GPL(skb_icv_walk); +EXPORT_SYMBOL_GPL(skb_to_sgvec); +#endif #if defined (CONFIG_IPV6_MODULE) || defined (CONFIG_IP_SCTP_MODULE) /* inet functions common to v4 and v6 */ From warlord@MIT.EDU Wed Mar 5 06:52:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 06:52:51 -0800 (PST) Received: from fort-point-station.mit.edu (FORT-POINT-STATION.MIT.EDU [18.7.7.76]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25Eqmf18604 for ; Wed, 5 Mar 2003 06:52:48 -0800 Received: from grand-central-station.mit.edu (GRAND-CENTRAL-STATION.MIT.EDU [18.7.21.82]) by fort-point-station.mit.edu (8.9.2/8.9.2) with ESMTP id JAA01806; Wed, 5 Mar 2003 09:52:46 -0500 (EST) Received: from manawatu-mail-centre.mit.edu (MANAWATU-MAIL-CENTRE.MIT.EDU [18.7.7.71]) by grand-central-station.mit.edu (8.9.2/8.9.2) with ESMTP id JAA08319; Wed, 5 Mar 2003 09:52:45 -0500 (EST) Received: from kikki.mit.edu (KIKKI.MIT.EDU [18.18.1.142]) ) by manawatu-mail-centre.mit.edu (8.12.4/8.12.4) with ESMTP id h25Eqi6g020217; Wed, 5 Mar 2003 09:52:44 -0500 (EST) Received: (from warlord@localhost) by kikki.mit.edu (8.9.3) id JAA26295; Wed, 5 Mar 2003 09:52:44 -0500 (EST) To: bert hubert Cc: Andreas Jellinghaus , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: ipsec-tools 0.1 + kernel 2.5.64 References: <1046863752.441.7.camel@simulacron> <20030305112852.GA22351@outpost.ds9a.nl> From: Derek Atkins Date: 05 Mar 2003 09:52:44 -0500 In-Reply-To: <20030305112852.GA22351@outpost.ds9a.nl> Message-ID: User-Agent: Gnus/5.0808 (Gnus v5.8.8) 
Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 1863 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: warlord@MIT.EDU Precedence: bulk X-list: netdev Content-Length: 1291 Lines: 37 bert hubert writes: > On Wed, Mar 05, 2003 at 12:29:12PM +0100, Andreas Jellinghaus wrote: > > Hi, > > > > both manual keying and automatic keying with racoon (pre-shared secret) > > are working fine. No need to patch or modify anything. > > I tried only ipv4. > > By the way, regarding ipsec-tools 0.1, are you sure you want to fork the > projects involved? I spoke to the KAME people and unfortunately, at least for now, there is no other choice but to fork. Perhaps down the road we can merge, but as of last week they don't want to host a linux package. They are willing to take some of our patches, but that doesn't help with a build system. > By the way, you did not mention it here but ipsec-tools is available on > http://sourceforge.net/projects/ipsec-tools , I also link them from > http://lartc.org/howto/lartc.ipsec.html I didn't? Perhaps I said ipsec-tool.sourceforge.net which has a link to sourceforge.net/projects/ipsec-tools and is much shorter to type. 
;) > Regards, > > bert -derek -- Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory Member, MIT Student Information Processing Board (SIPB) URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH warlord@MIT.EDU PGP key available From davem@redhat.com Wed Mar 5 07:40:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 07:40:34 -0800 (PST) Received: from pizda.ninka.net (pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25FeTf19745 for ; Wed, 5 Mar 2003 07:40:30 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id HAA16030; Wed, 5 Mar 2003 07:21:50 -0800 Date: Wed, 05 Mar 2003 07:21:49 -0800 (PST) Message-Id: <20030305.072149.121185037.davem@redhat.com> To: kazunori@miyazawa.org Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: Re: [PATH] IPv6 IPsec support From: "David S. Miller" In-Reply-To: <20030305233025.784feb00.kazunori@miyazawa.org> References: <20030305233025.784feb00.kazunori@miyazawa.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1864 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 588 Lines: 21 From: Kazunori Miyazawa Date: Wed, 5 Mar 2003 23:30:25 +0900 Hello Miyazawa-san, I submit the patch to let the kernel support ipv6 ipsec again. It is able to compile ipv6 as a module. This patch includes a couple of clean-ups and changes of function names. Excellent work. I have comments, but they are very minor and can wait. I will apply your patch after basic build testing. The next large task will be to abstract out more common pieces of code. 
There is still quite a bit of code duplication between v4 and v6 xfrm methods, Thank you! From yoshfuji@linux-ipv6.org Wed Mar 5 07:48:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 07:48:26 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.11.6/8.11.6) with SMTP id h25FmNf20276 for ; Wed, 5 Mar 2003 07:48:23 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h25FmMUl009623; Thu, 6 Mar 2003 00:48:22 +0900 Date: Thu, 06 Mar 2003 00:48:20 +0900 (JST) Message-Id: <20030306.004820.41101302.yoshfuji@linux-ipv6.org> To: davem@redhat.com Cc: kazunori@miyazawa.org, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi@linux-ipv6.org Subject: Re: (usagi-core 12294) Re: [PATCH] IPv6 IPsec support From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030305.072149.121185037.davem@redhat.com> References: <20030305233025.784feb00.kazunori@miyazawa.org> <20030305.072149.121185037.davem@redhat.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1865 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 482 Lines: 14 In article <20030305.072149.121185037.davem@redhat.com> (at Wed, 05 Mar 2003 07:21:49 -0800 (PST)), "David S. 
Miller" says: > I will apply your patch after basic build testing. Thank you. > The next large task will be to abstract out more common > pieces of code. There is still quite a bit of code duplication > between v4 and v6 xfrm methods, Yes, we will do that. That patch is the first step toward reducing duplicated code between IPv4 and IPv6. --yoshfuji From mcmanus@datapower.ducksong.com Wed Mar 5 13:00:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 13:00:57 -0800 (PST) Received: from datapower.ducksong.com (ip67-93-141-186.z141-93-67.customer.algx.net [67.93.141.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25L0n40027680 for ; Wed, 5 Mar 2003 13:00:53 -0800 Received: (from mcmanus@localhost) by datapower.ducksong.com (8.11.6/8.11.6) id h25L0ms10895 for netdev@oss.sgi.com; Wed, 5 Mar 2003 16:00:48 -0500 Date: Wed, 5 Mar 2003 16:00:48 -0500 From: "Patrick R. McManus" To: netdev@oss.sgi.com Subject: SIOCETHTOOL ioctl() and a corrupted cmd argument Message-ID: <20030305210047.GA10824@ducksong.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1866 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcmanus@ducksong.com Precedence: bulk X-list: netdev Content-Length: 1303 Lines: 45 Hello, this is odd. My problem is with the cmd argument to a driver's ioctl() handler getting modified when the caller is non-root. I have a 2.4.19-era kernel and am running the e1000 driver, as a module, from the 2.4.20 kernel. (drivers previous to 4.4.12 tended to keep resetting themselves on me.) 
my userspace code make a call that looks like this struct ethtool_cmd ec; int fd; int rv = -1; memset (&ifr,0,sizeof(ifr)); strncpy (ifr.ifr_name, getName(),IFNAMSIZ); fd = socket (PF_INET,SOCK_DGRAM,0); ifr.ifr_data = (char *) &ec; ec.cmd = ETHTOOL_GSET; fprintf (stderr,"SIOCETHTOOL is %X\n",SIOCETHTOOL); if (ioctl(fd, SIOCETHTOOL, &ifr) >=0) stderr always prints: SIOCETHTOOL is 8946 when I run the userspace code as root the ioctl succeeds, when I run it as an unpriv'd user it fails. So I annotated the driver by adding to e1000_ioctl: printk(KERN_INFO "general ioctl cmd %X, magic %X\n",cmd,SIOCETHTOOL); as root I get the expected Mar 5 15:53:33 mcmanus kernel: general ioctl cmd 8946, magic 8946 as a regular user I get Mar 5 15:46:57 mcmanus kernel: general ioctl cmd 89F0, magic 8946 can someone help me with the chain to look at for why the cmd value might be getting modified? -Patrick From aj@dungeon.inka.de Wed Mar 5 13:24:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 13:24:41 -0800 (PST) Received: from mail.inka.de (mail@quechua.inka.de [193.197.184.2]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25LN540031655 for ; Wed, 5 Mar 2003 13:24:38 -0800 Received: from dungeon.inka.de (uucp@[127.0.0.1]) by mail.inka.de with uucp (rmailwrap 0.5) id 18qgLv-0007Je-00; Wed, 05 Mar 2003 22:23:03 +0100 Received: from [192.168.0.10] (unknown [192.168.0.10]) by dungeon.inka.de (Postfix) with ESMTP id 154B620E4F; Wed, 5 Mar 2003 22:22:59 +0100 (CET) Subject: Re: ipsec-tools 0.1 + kernel 2.5.64 From: Andreas Jellinghaus Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <20030305112852.GA22351@outpost.ds9a.nl> References: <1046863752.441.7.camel@simulacron> <20030305112852.GA22351@outpost.ds9a.nl> Content-Type: text/plain Organization: Message-Id: <1046899726.440.0.camel@simulacron> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 Date: 05 Mar 2003 22:28:46 +0100 Content-Transfer-Encoding: 7bit X-archive-position: 1867 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: aj@dungeon.inka.de Precedence: bulk X-list: netdev Content-Length: 134 Lines: 7 it's working fine with win 2k pro (ipsec, 3des, sha, pre shared key). I will try to write something useful for the howto. Andreas From mcmanus@datapower.ducksong.com Wed Mar 5 13:32:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 13:32:09 -0800 (PST) Received: from datapower.ducksong.com (ip67-93-141-189.z141-93-67.customer.algx.net [67.93.141.189]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25LW540032151 for ; Wed, 5 Mar 2003 13:32:06 -0800 Received: (from mcmanus@localhost) by datapower.ducksong.com (8.11.6/8.11.6) id h25LW5g02187 for netdev@oss.sgi.com; Wed, 5 Mar 2003 16:32:05 -0500 Date: Wed, 5 Mar 2003 16:32:05 -0500 From: "Patrick R. McManus" To: netdev@oss.sgi.com Subject: Re: SIOCETHTOOL ioctl() and a corrupted cmd argument Message-ID: <20030305213205.GA1227@ducksong.com> References: <20030305210047.GA10824@ducksong.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030305210047.GA10824@ducksong.com> User-Agent: Mutt/1.4i X-archive-position: 1868 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcmanus@ducksong.com Precedence: bulk X-list: netdev Content-Length: 869 Lines: 24 [Patrick R. McManus: Mar 05 16:00] > as a regular user I get > Mar 5 15:46:57 mcmanus kernel: general ioctl cmd 89F0, magic 8946 > turns out, as I had expected, my report is bogus.. this ioctl is a fallback after the siocethtool fails. the driver do_ioctl() never gets invoked at all when the ioctl() is invoked without being root. this would be because in net/core/dev.c dev_ioctl() they are filtered out: case SIOCETHTOOL: case SIOCGMIIPHY: case SIOCGMIIREG: if (!capable(CAP_NET_ADMIN)) return -EPERM; but SIOCETHTOOL shouldn't need perms, right? 
it has some functionality that needs it and some that doesn't, and the driver sorts it out.. there isn't a GIOCETHTOOL at all.. #define ETHTOOL_GSET 0x00000001 /* Get settings. */ #define ETHTOOL_SSET 0x00000002 /* Set settings, privileged. */ From garzik@gtf.org Wed Mar 5 13:42:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 13:42:05 -0800 (PST) Received: from havoc.gtf.org (havoc.daloft.com [64.213.145.173]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25Lg1Ru032677 for ; Wed, 5 Mar 2003 13:42:03 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id 02824663B; Wed, 5 Mar 2003 16:41:55 -0500 (EST) Date: Wed, 5 Mar 2003 16:41:55 -0500 From: Jeff Garzik To: "Patrick R. McManus" Cc: netdev@oss.sgi.com Subject: Re: SIOCETHTOOL ioctl() and a corrupted cmd argument Message-ID: <20030305214155.GM13420@gtf.org> References: <20030305210047.GA10824@ducksong.com> <20030305213205.GA1227@ducksong.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030305213205.GA1227@ducksong.com> User-Agent: Mutt/1.3.28i X-archive-position: 1869 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 750 Lines: 21 On Wed, Mar 05, 2003 at 04:32:05PM -0500, Patrick R. McManus wrote: > but SIOCETHTOOL shouldn't need perms, right? it has some functionality > that needs it and some that doesn't, and the driver sorts it > out.. there isn't a GIOCETHTOOL at all.. > > #define ETHTOOL_GSET 0x00000001 /* Get settings. */ > #define ETHTOOL_SSET 0x00000002 /* Set settings, privileged. */ You are correct that the comment is misleading... everything ethtool does currently requires CAP_NET_ADMIN. This is one of the costs of lumping things under one ioctl, rather than constantly using new ioctls. 
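As a rough illustration of the alternative (a userspace sketch, not kernel code), the single capable(CAP_NET_ADMIN) test quoted from dev_ioctl() could in principle be replaced by a check that looks at the ethtool sub-command first. The helper names ethtool_cmd_needs_admin() and ethtool_check_perm() are made up for this example, and treating ETHTOOL_GSET as unprivileged is an assumption taken from the header comments quoted in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Sub-command values as quoted from the ethtool header in this thread. */
#define ETHTOOL_GSET 0x00000001 /* Get settings. */
#define ETHTOOL_SSET 0x00000002 /* Set settings, privileged. */

/*
 * Hypothetical policy helper: does this ethtool sub-command need
 * CAP_NET_ADMIN?  Only the read-only "get settings" command is
 * treated as unprivileged; anything unknown stays privileged.
 */
static bool ethtool_cmd_needs_admin(uint32_t ethcmd)
{
	switch (ethcmd) {
	case ETHTOOL_GSET:
		return false;
	default:
		return true;
	}
}

/*
 * Hypothetical gate standing in for the blanket check in dev_ioctl().
 * has_net_admin models the result of capable(CAP_NET_ADMIN).
 * Returns 0 when the call may proceed, -EPERM otherwise.
 */
static int ethtool_check_perm(uint32_t ethcmd, bool has_net_admin)
{
	if (ethtool_cmd_needs_admin(ethcmd) && !has_net_admin)
		return -EPERM;
	return 0;
}
```

With this policy an unprivileged ETHTOOL_GSET would pass, while ETHTOOL_SSET from the same caller would still get -EPERM. Defaulting unknown sub-commands to privileged keeps the check conservative.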
It is certainly possible (and reasonable) that a future kernel peeks at the ioctl and then conditionally checks privs, but this is not currently the case. Jeff From bunk@fs.tum.de Wed Mar 5 14:55:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 14:55:38 -0800 (PST) Received: from hermes.fachschaften.tu-muenchen.de (hermes.fachschaften.tu-muenchen.de [129.187.202.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25Msoq9006687 for ; Wed, 5 Mar 2003 14:55:31 -0800 Received: (qmail 1254 invoked from network); 5 Mar 2003 22:54:44 -0000 Received: from mimas.fachschaften.tu-muenchen.de (129.187.202.58) by hermes.fachschaften.tu-muenchen.de with QMQP; 5 Mar 2003 22:54:44 -0000 Date: Wed, 5 Mar 2003 23:54:41 +0100 From: Adrian Bunk To: davem@redhat.com, netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: Chaotic structure of the net headers? Message-ID: <20030305225441.GO20423@fs.tum.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-archive-position: 1870 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@fs.tum.de Precedence: bulk X-list: netdev Content-Length: 2072 Lines: 60 Hi, if all I'm describing is completely logical and I'm only too dumb to see the logic please forgive me. ;-) In 2.5.64 there are networking headers both under include/linux/ and include/net/. I don't understand whether there's a deeper logic why e.g. the netfilter headers are under include/linux/. There's some duplication, e.g. 
include/linux/in6.h contains <-- snip --> /* * IPV6 extension headers */ #define IPPROTO_HOPOPTS 0 /* IPv6 hop-by-hop options */ #define IPPROTO_ROUTING 43 /* IPv6 routing header */ #define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */ #define IPPROTO_ICMPV6 58 /* ICMPv6 */ #define IPPROTO_NONE 59 /* IPv6 no next header */ #define IPPROTO_DSTOPTS 60 /* IPv6 destination options */ <-- snip --> and include/net/ipv6.h contains: <-- snip --> /* * NextHeader field of IPv6 header */ #define NEXTHDR_HOP 0 /* Hop-by-hop option header. */ #define NEXTHDR_TCP 6 /* TCP segment. */ #define NEXTHDR_UDP 17 /* UDP message. */ #define NEXTHDR_IPV6 41 /* IPv6 in IPv6 */ #define NEXTHDR_ROUTING 43 /* Routing header. */ #define NEXTHDR_FRAGMENT 44 /* Fragmentation/reassembly header. */ #define NEXTHDR_ESP 50 /* Encapsulating security payload. */ #define NEXTHDR_AUTH 51 /* Authentication header. */ #define NEXTHDR_ICMP 58 /* ICMP for IPv6. */ #define NEXTHDR_NONE 59 /* No next header */ #define NEXTHDR_DEST 60 /* Destination options header. */ <-- snip --> Two different #define's for the same thing doesn't sound like a good idea? cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. 
Buck - Dragon Seed From davem@redhat.com Wed Mar 5 14:58:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 14:58:58 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25Mwtq9007077 for ; Wed, 5 Mar 2003 14:58:56 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id OAA17413; Wed, 5 Mar 2003 14:39:51 -0800 Date: Wed, 05 Mar 2003 14:39:51 -0800 (PST) Message-Id: <20030305.143951.118510613.davem@redhat.com> To: bunk@fs.tum.de Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Chaotic structure of the net headers? From: "David S. Miller" In-Reply-To: <20030305225441.GO20423@fs.tum.de> References: <20030305225441.GO20423@fs.tum.de> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1871 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 222 Lines: 8 From: Adrian Bunk Date: Wed, 5 Mar 2003 23:54:41 +0100 Two different #define's for the same thing doesn't sound like a good idea? Required by the ipv6 advanced sockets API I do believe. 
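
The overlap discussed in this thread can be checked mechanically. The following is only a minimal sketch in C, assuming a glibc-style <netinet/in.h> that provides the IPPROTO_* values. The NEXTHDR_* values are copied from the include/net/ipv6.h excerpt quoted above; the pair table and check_nexthdr_pairs() are hypothetical names introduced for illustration and are not part of any kernel header.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>   /* user-space IPPROTO_* constants */

/* Kernel-side names and values, as quoted from include/net/ipv6.h. */
#define NEXTHDR_HOP        0   /* Hop-by-hop option header. */
#define NEXTHDR_ROUTING   43   /* Routing header. */
#define NEXTHDR_FRAGMENT  44   /* Fragmentation/reassembly header. */
#define NEXTHDR_ICMP      58   /* ICMP for IPv6. */
#define NEXTHDR_NONE      59   /* No next header */
#define NEXTHDR_DEST      60   /* Destination options header. */

/* Hypothetical helper type: one NEXTHDR_* value next to its IPPROTO_* twin. */
struct nexthdr_pair {
    int nexthdr;
    int ipproto;
};

static const struct nexthdr_pair nexthdr_pairs[] = {
    { NEXTHDR_HOP,      IPPROTO_HOPOPTS  },
    { NEXTHDR_ROUTING,  IPPROTO_ROUTING  },
    { NEXTHDR_FRAGMENT, IPPROTO_FRAGMENT },
    { NEXTHDR_ICMP,     IPPROTO_ICMPV6   },
    { NEXTHDR_NONE,     IPPROTO_NONE     },
    { NEXTHDR_DEST,     IPPROTO_DSTOPTS  },
};

/*
 * Aborts via assert() if any pair disagrees (unless NDEBUG is defined);
 * returns the number of pairs in the table.
 */
int check_nexthdr_pairs(void)
{
    size_t n = sizeof(nexthdr_pairs) / sizeof(nexthdr_pairs[0]);

    for (size_t i = 0; i < n; i++)
        assert(nexthdr_pairs[i].nexthdr == nexthdr_pairs[i].ipproto);
    return (int)n;
}
```

Calling check_nexthdr_pairs() from a small test program is enough to confirm that, for these six values, the two naming schemes describe the same numbers.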
From Rod.VanMeter@nokia.com Wed Mar 5 15:18:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 15:18:58 -0800 (PST) Received: from mailhost.iprg.nokia.com (mailhost.iprg.nokia.com [205.226.5.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25NIsq9008581 for ; Wed, 5 Mar 2003 15:18:55 -0800 Received: from darkstar.iprg.nokia.com (darkstar.iprg.nokia.com [205.226.5.69]) by mailhost.iprg.nokia.com (8.9.3/8.9.3-GLGS) with ESMTP id PAA15361; Wed, 5 Mar 2003 15:18:42 -0800 (PST) Received: (from root@localhost) by darkstar.iprg.nokia.com (8.11.0/8.11.0-DARKSTAR) id h25NIf430270; Wed, 5 Mar 2003 15:18:41 -0800 X-mProtect: <200303052318> Nokia Silicon Valley Messaging Protection Received: from UNKNOWN (172.19.68.126, claiming to be "dadhcp-172019068126.americas.nokia.com") by darkstar.iprg.nokia.com smtpd7QgrwF; Wed, 05 Mar 2003 15:18:39 PST Subject: Re: Chaotic structure of the net headers? From: Rod Van Meter Reply-To: Rod.VanMeter@nokia.com To: ext Adrian Bunk Cc: davem@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20030305225441.GO20423@fs.tum.de> References: <20030305225441.GO20423@fs.tum.de> Content-Type: text/plain Organization: Nokia Networks Message-Id: <1046905834.17778.400.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 05 Mar 2003 15:10:35 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 1873 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Rod.VanMeter@nokia.com Precedence: bulk X-list: netdev On Wed, 2003-03-05 at 14:54, ext Adrian Bunk wrote: > > There's some duplication, e.g. 
include/linux/in6.h contains > > /* > * IPV6 extension headers > */ > #define IPPROTO_HOPOPTS 0 /* IPv6 hop-by-hop options */ > #define IPPROTO_ROUTING 43 /* IPv6 routing header */ > #define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */ > #define IPPROTO_ICMPV6 58 /* ICMPv6 */ > #define IPPROTO_NONE 59 /* IPv6 no next header */ > #define IPPROTO_DSTOPTS 60 /* IPv6 destination options */ According to RFC2292 (Advanced Sockets): 2.1.1. IPv6 Next Header Values IPv6 defines many new values for the Next Header field. The following constants are defined as a result of including . #define IPPROTO_HOPOPTS 0 /* IPv6 Hop-by-Hop options */ #define IPPROTO_IPV6 41 /* IPv6 header */ #define IPPROTO_ROUTING 43 /* IPv6 Routing header */ #define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */ #define IPPROTO_ESP 50 /* encapsulating security payload */ #define IPPROTO_AH 51 /* authentication header */ #define IPPROTO_ICMPV6 58 /* ICMPv6 */ #define IPPROTO_NONE 59 /* IPv6 no next header */ #define IPPROTO_DSTOPTS 60 /* IPv6 Destination options */ Berkeley-derived IPv4 implementations also define IPPROTO_IP to be 0. This should not be a problem since IPPROTO_IP is used only with IPv4 sockets and IPPROTO_HOPOPTS only with IPv6 sockets. > > and include/net/ipv6.h contains: > > <-- snip --> > > /* > * NextHeader field of IPv6 header > */ > > #define NEXTHDR_HOP 0 /* Hop-by-hop option header. */ > #define NEXTHDR_TCP 6 /* TCP segment. */ > #define NEXTHDR_UDP 17 /* UDP message. */ > #define NEXTHDR_IPV6 41 /* IPv6 in IPv6 */ > #define NEXTHDR_ROUTING 43 /* Routing header. */ > #define NEXTHDR_FRAGMENT 44 /* Fragmentation/reassembly header. */ This form doesn't appear in RFC2292, nor in 2133 (Basic Socket...) My interpretation is that this latter form is defined for kernel use, while the former is for user-level manipulation of raw packet fields (the primary purpose of 2292). Does it make sense to have two forms, one kernel, one user? I haven't e.g. 
followed the desired include chain. If we wanted to merge the uses, the former form and include location would probably have to be used. I've been looking into this. There are a *few* things missing from the 2292 support. AFAICT, it's just a handful of functions/macros for manipulating option headers that need to be added. Does anybody actually USE this stuff (the advanced sockets API, I mean, not IPv6)? I'm planning to add those missing bits, just for kicks, but haven't done it yet. --Rod From davem@redhat.com Wed Mar 5 15:23:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 15:23:33 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25NMpq9008942 for ; Wed, 5 Mar 2003 15:23:31 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA17467; Wed, 5 Mar 2003 15:03:44 -0800 Date: Wed, 05 Mar 2003 15:03:44 -0800 (PST) Message-Id: <20030305.150344.50145701.davem@redhat.com> To: Rod.VanMeter@nokia.com Cc: bunk@fs.tum.de, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Chaotic structure of the net headers? From: "David S. Miller" In-Reply-To: <1046905834.17778.400.camel@localhost.localdomain> References: <20030305225441.GO20423@fs.tum.de> <1046905834.17778.400.camel@localhost.localdomain> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1874 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Rod Van Meter Date: 05 Mar 2003 15:10:35 -0800 Does it make sense to have two forms, one kernel, one user? I haven't e.g. followed the desired include chain. 
   If we wanted to merge the uses, the former form and include location would probably have to be used.

   I've been looking into this. There are a *few* things missing from the 2292 support. AFAICT, it's just a handful of functions/macros for manipulating option headers that need to be added.

Actually, forget all my comments: GLIBC headers are where the advanced socket API requirements for headers should be applied.

And since this is only used in the kernel, there is no need for the NEXTHDR_* if it truly just duplicates the IPPROTO_* defines.

I'm willing to accept a cleanup patch of this nature, sure.

From davem@redhat.com Wed Mar 5 15:44:12 2003
Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 15:44:15 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25NiBq9009621 for ; Wed, 5 Mar 2003 15:44:12 -0800
Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA17543; Wed, 5 Mar 2003 15:25:31 -0800
Date: Wed, 05 Mar 2003 15:25:30 -0800 (PST)
Message-Id: <20030305.152530.70806720.davem@redhat.com>
To: kazunori@miyazawa.org
Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org
Subject: Re: [PATH] IPv6 IPsec support
From: "David S. Miller"
In-Reply-To: <20030305233025.784feb00.kazunori@miyazawa.org>
References: <20030305233025.784feb00.kazunori@miyazawa.org>
X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1875 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Kazunori Miyazawa Date: Wed, 5 Mar 2003 23:30:25 +0900 Hello Miyazawa-san, I submit the patch to let the kernel support ipv6 ipsec again. It is able to comple ipv6 as module. As promised I applied the patch. I will push it to Linus later this evening, or tomorrow. In this initial checkin I made only 2 minor fixes, they are attached below: --- ./include/net/ip6_route.h.~1~ Wed Mar 5 15:32:41 2003 +++ ./include/net/ip6_route.h Wed Mar 5 15:40:42 2003 @@ -38,7 +38,6 @@ extern int ipv6_route_ioctl(unsigned int cmd, void *arg); extern int ip6_route_add(struct in6_rtmsg *rtmsg); -extern int ip6_route_del(struct in6_rtmsg *rtmsg); extern int ip6_del_rt(struct rt6_info *); extern int ip6_rt_addr_add(struct in6_addr *addr, --- ./net/ipv6/Kconfig.~1~ Wed Mar 5 15:32:41 2003 +++ ./net/ipv6/Kconfig Wed Mar 5 15:35:27 2003 @@ -19,6 +19,7 @@ config INET6_AH tristate "IPv6: AH transformation" + depends on IPV6 ---help--- Support for IPsec AH. @@ -26,6 +27,7 @@ config INET6_ESP tristate "IPv6: ESP transformation" + depends on IPV6 ---help--- Support for IPsec ESP. 
From davem@redhat.com Wed Mar 5 15:59:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 15:59:52 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h25Nxlq9010071 for ; Wed, 5 Mar 2003 15:59:48 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA17576; Wed, 5 Mar 2003 15:41:01 -0800 Date: Wed, 05 Mar 2003 15:41:00 -0800 (PST) Message-Id: <20030305.154100.28816301.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: kazunori@miyazawa.org, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi@linux-ipv6.org Subject: Re: (usagi-core 12294) Re: [PATCH] IPv6 IPsec support From: "David S. Miller" In-Reply-To: <20030306.004820.41101302.yoshfuji@linux-ipv6.org> References: <20030305233025.784feb00.kazunori@miyazawa.org> <20030305.072149.121185037.davem@redhat.com> <20030306.004820.41101302.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 1876 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: YOSHIFUJI Hideaki / $B5HF#1QL@(B Date: Thu, 06 Mar 2003 00:48:20 +0900 (JST) > The next large task will be to abstract out more common > pieces of code. There is still quite a bit of code duplication > between v4 and v6 xfrm methods, Yes, we will do that. That patch is first step for reducing duplicate codes between IPv4 and IPv6. Great. I believe it should be possible, in the end, to make the XFRM engine %100 address-family (v4, v6 etc.) and protocol (ah, esp) independant. 
If that goal is achieved, we may move generic parts from net/ipv4/xfrm_*.c to net/xfrm_*.c Note that this coincides with the idea to eventually have an address-family independant flow cache. Most of the address-family specific areas are: 1) DST lookup (xfrm_dst_lookup_t) 2) selector key comparisons and state lookup (xfrm$(AF)_selector_match, xfrm$(AF)_state_find) 3) receive processing (xfrm${AF}_rcv) #1 is made for ipv6 by Miyazawa-san's patch. This could logically be extended to handle issues #2 and #3 above. All protocol specific (ESP, AH) and address-family specific references should go away from places like include/net/xfrm.h I think you understand all of this, and therefore I cannot wait for the next ipsec cleanup patch from you :) Finally, note that eventually we will need some reference counting scheme for to allow xfrm address-family modules to be unloaded safely. Currently, ipv4 cannot be a module and ipv6 as a module is not able to unload :-) So the module unload problem does not exist right at this moment. So ignore this issue for now. From rgb@conscoop.ottawa.on.ca Wed Mar 5 16:06:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 16:06:37 -0800 (PST) Received: from conscoop.ottawa.on.ca (cpu2747.adsl.bellglobal.com [207.236.55.216]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2605qq9010510 for ; Wed, 5 Mar 2003 16:06:34 -0800 Received: (from rgb@localhost) by conscoop.ottawa.on.ca (8.12.0.Beta5/8.11.6) id h25NrKZG008496; Wed, 5 Mar 2003 18:53:20 -0500 Date: Wed, 5 Mar 2003 18:53:20 -0500 From: Richard Guy Briggs To: Rod Van Meter Cc: ext Adrian Bunk , davem@redhat.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Chaotic structure of the net headers? 
Message-ID: <20030305185320.I4305@grendel.conscoop.ottawa.on.ca> References: <20030305225441.GO20423@fs.tum.de> <1046905834.17778.400.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <1046905834.17778.400.camel@localhost.localdomain>; from Rod.VanMeter@nokia.com on Wed, Mar 05, 2003 at 03:10:35PM -0800 X-archive-position: 1877 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rgb@conscoop.ottawa.on.ca Precedence: bulk X-list: netdev On Wed, Mar 05, 2003 at 03:10:35PM -0800, Rod Van Meter wrote: > On Wed, 2003-03-05 at 14:54, ext Adrian Bunk wrote: > > There's some duplication, e.g. include/linux/in6.h contains > > > > /* > > * IPV6 extension headers > > */ > > #define IPPROTO_HOPOPTS 0 /* IPv6 hop-by-hop options */ > > #define IPPROTO_ROUTING 43 /* IPv6 routing header */ > > #define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */ > > #define IPPROTO_ICMPV6 58 /* ICMPv6 */ > > #define IPPROTO_NONE 59 /* IPv6 no next header */ > > #define IPPROTO_DSTOPTS 60 /* IPv6 destination options */ > > > According to RFC2292 (Advanced Sockets): > > 2.1.1. IPv6 Next Header Values > > IPv6 defines many new values for the Next Header field. The > following constants are defined as a result of including > . > > #define IPPROTO_HOPOPTS 0 /* IPv6 Hop-by-Hop options */ > #define IPPROTO_IPV6 41 /* IPv6 header */ > #define IPPROTO_ROUTING 43 /* IPv6 Routing header */ > #define IPPROTO_FRAGMENT 44 /* IPv6 fragmentation header */ > #define IPPROTO_ESP 50 /* encapsulating security payload */ > #define IPPROTO_AH 51 /* authentication header */ > #define IPPROTO_ICMPV6 58 /* ICMPv6 */ > #define IPPROTO_NONE 59 /* IPv6 no next header */ > #define IPPROTO_DSTOPTS 60 /* IPv6 Destination options */ > > Berkeley-derived IPv4 implementations also define IPPROTO_IP to be 0. 
> This should not be a problem since IPPROTO_IP is used only with IPv4 > sockets and IPPROTO_HOPOPTS only with IPv6 sockets. The Linux FreeS/WAN IPsec implementation has been using IPPROTO_ESP, IPPROTO_AH, IPPROTO_INT (61, put aside by IANA for internal use), IPPROTO_COMP (108), IPPROTO_IPIP (4) for the last 5 years, based on common usage and examples such as IPPROTO_UDP, IPPROTO_TCP, IPPROTO_ICMP. > > and include/net/ipv6.h contains: > > > > <-- snip --> > > > > /* > > * NextHeader field of IPv6 header > > */ > > > > #define NEXTHDR_HOP 0 /* Hop-by-hop option header. */ > > #define NEXTHDR_TCP 6 /* TCP segment. */ > > #define NEXTHDR_UDP 17 /* UDP message. */ > > #define NEXTHDR_IPV6 41 /* IPv6 in IPv6 */ > > #define NEXTHDR_ROUTING 43 /* Routing header. */ > > #define NEXTHDR_FRAGMENT 44 /* Fragmentation/reassembly header. */ > > This form doesn't appear in RFC2292, nor in 2133 (Basic Socket...) > > My interpretation is that this latter form is defined for kernel use, > while the former is for user-level manipulation of raw packet fields > (the primary purpose of 2292). We use these in the kernel, but not in userspace. We define SA_ESP, etc... > Does it make sense to have two forms, one kernel, one user? I haven't > e.g. followed the desired include chain. If we wanted to merge the > uses, the former form and include location would probably have to be > used. We use the two forms since shared user/kernel headers are a nuisance... > I've been looking into this. There are a *few* things missing from the > 2292 support. AFAICT, it's just a handful of functions/macros for > manipulating option headers that need to be added. > > Does anybody actually USE this stuff (the advanced sockets API, I mean, > not IPv6)? I'm planning to add those missing bits, just for kicks, but > haven't done it yet. > > --Rod slainte mhath, RGB -- Richard Guy Briggs -- ~\ Auto-Free Ottawa! Canada -- \@ @ No Internet Wiretapping! -- _\\/\%___\\/\% Vote! 
-- _______GTVS6#790__(*)_______(*)(*)_______ From kazunori@miyazawa.org Wed Mar 5 16:32:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 16:32:03 -0800 (PST) Received: from miyazawa.org (usen-43x235x12x234.ap-USEN.usen.ad.jp [43.235.12.234]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h260Vwq9011219 for ; Wed, 5 Mar 2003 16:31:59 -0800 Received: from monza.miyazawa.org ([2001:200:0:ff18:220:e0ff:fe8a:e797]) (AUTH: LOGIN kazunori, ) by miyazawa.org with esmtp; Thu, 06 Mar 2003 09:14:01 +0900 Date: Thu, 6 Mar 2003 09:32:19 +0900 From: Kazunori Miyazawa To: "David S. Miller" Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: Re: [PATH] IPv6 IPsec support Message-Id: <20030306093219.1a702868.kazunori@miyazawa.org> In-Reply-To: <20030305.152530.70806720.davem@redhat.com> References: <20030305233025.784feb00.kazunori@miyazawa.org> <20030305.152530.70806720.davem@redhat.com> X-Mailer: Sylpheed version 0.8.10 (GTK+ 1.2.10; i386-debian-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1878 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kazunori@miyazawa.org Precedence: bulk X-list: netdev Hello David, On Wed, 05 Mar 2003 15:25:30 -0800 (PST) "David S. Miller" wrote: > From: Kazunori Miyazawa > Date: Wed, 5 Mar 2003 23:30:25 +0900 > > Hello Miyazawa-san, > > I submit the patch to let the kernel support ipv6 ipsec again. > It is able to comple ipv6 as module. > > As promised I applied the patch. I will push it to Linus later > this evening, or tomorrow. > > In this initial checkin I made only 2 minor fixes, they > are attached below: > Thank you very much. My patch is the first step. I think there are these TODOs around IPv6 IPsec as far as I remember. 
- Extension Header Processing on inbound: As a result of IPv6 IPsec support, Extension Header processing is devided into ipv6_parse_exthdrs and ipproto->handler. I think it is better to merge other Extension Header handling into ipproto->handler. - Fragmentation support on outbound: We should change ipv6_build_xmit like ip_append_data style to support fragmentation with IPsec. - Removing duplicate codes, clean up and improveing performance. - Considering relation of IPv6 IPsec and Mobile IPv6. This is future stuff. Best regards, --Kazunori Miyazawa (Yokogawa Electric Corporation) From davem@redhat.com Wed Mar 5 21:02:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 05 Mar 2003 21:02:52 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2652lq9015424 for ; Wed, 5 Mar 2003 21:02:48 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA18044; Wed, 5 Mar 2003 20:43:49 -0800 Date: Wed, 05 Mar 2003 20:43:48 -0800 (PST) Message-Id: <20030305.204348.130225511.davem@redhat.com> To: kazunori@miyazawa.org Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: Re: [PATH] IPv6 IPsec support From: "David S. Miller" In-Reply-To: <20030306093219.1a702868.kazunori@miyazawa.org> References: <20030305233025.784feb00.kazunori@miyazawa.org> <20030305.152530.70806720.davem@redhat.com> <20030306093219.1a702868.kazunori@miyazawa.org> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1879 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev From: Kazunori Miyazawa Date: Thu, 6 Mar 2003 09:32:19 +0900 - Extension Header Processing on inbound: As a result of IPv6 IPsec support, Extension Header processing is devided into ipv6_parse_exthdrs and ipproto->handler. I think it is better to merge other Extension Header handling into ipproto->handler. Ok. - Fragmentation support on outbound: We should change ipv6_build_xmit like ip_append_data style to support fragmentation with IPsec. Please work together with Alexey on this. There are known major problems on ipv4 side, and it must be resolved before ipv6 side may be done. For example, right now a non-TCP packet can do the following. If it is just slightly smaller than MTU, and when encapsulated in ESP/AH it becomes larger than MTU, we will not fragment it and too-large frame will be sent to device. In my last round of talks with Alexey I believe we were very close to a possible solution to this problem. The idea was to have a "local dont-fragment" flag, and at the very last stage of IP output we check this and either 1) clear DF and fragment or 2) drop packet and send ICMP message back. Alexey, what is the current state? - Removing duplicate codes, clean up and improveing performance. - Considering relation of IPv6 IPsec and Mobile IPv6. This is future stuff. Ok. 
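
The "local dont-fragment" idea described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the code that was eventually merged: the struct, the enum, and ip_output_decide() are hypothetical names, and the flag is read here as "the local stack is allowed to clear DF and fragment", which is one possible interpretation of the proposal.

```c
#include <assert.h>

/* Hypothetical outcome of the check at the very last stage of IP output. */
enum out_action {
    OUT_SEND,                   /* fits in the MTU, send unchanged */
    OUT_FRAGMENT,               /* DF not set, ordinary fragmentation */
    OUT_CLEAR_DF_AND_FRAGMENT,  /* option 1 in the mail above */
    OUT_DROP_AND_SEND_ICMP      /* option 2 in the mail above */
};

/* Hypothetical, simplified view of an outgoing packet. */
struct out_pkt {
    unsigned int len;   /* length after ESP/AH encapsulation */
    int df;             /* DF bit set in the IP header */
    int local_frag_ok;  /* assumed meaning of the "local dont-fragment" flag */
};

enum out_action ip_output_decide(const struct out_pkt *p, unsigned int mtu)
{
    assert(p != 0 && mtu > 0);

    if (p->len <= mtu)
        return OUT_SEND;
    if (!p->df)
        return OUT_FRAGMENT;
    /* Too large and DF is set: the flag picks between the two options. */
    if (p->local_frag_ok)
        return OUT_CLEAR_DF_AND_FRAGMENT;
    return OUT_DROP_AND_SEND_ICMP;
}
```

The case from the mail is a non-TCP packet slightly below the MTU whose length exceeds the MTU only after encapsulation; with this check it would be either fragmented or dropped with an ICMP error instead of being handed to the device oversized.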
From rreddy@c.psc.edu Thu Mar 6 08:01:46 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 08:01:49 -0800 (PST)
Received: from c.psc.edu (c.psc.edu [128.182.73.106]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26G1jq9018848 for ; Thu, 6 Mar 2003 08:01:46 -0800
Received: by c.psc.edu for NETDEV@OSS.SGI.COM; Thu, 6 Mar 2003 11:01:45 -0500
Date: Thu, 6 Mar 2003 11:01:45 -0500
From: "Raghurama 'REDDY'"
Reply-To: rreddy@psc.edu
To: NETDEV@OSS.SGI.COM
CC: RREDDY@vms.psc.edu
Message-Id: <03030611014533.2221238e.7643064@psc.edu>
Subject: Output on raw sockets ignores IP_DF when packet is bigger than pmtu
X-archive-position: 1881
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rreddy@psc.edu
Precedence: bulk
X-list: netdev
Content-Length: 1506
Lines: 46

Hello!

We are looking at the "traceroute" code, in particular the behavior of the "-M" option (path MTU discovery) on a 2.4.20 kernel. In attempting to get a "Fragmentation required" message from an intermediate router, the code does the following:

- Open a raw socket
- Send a packet that is smaller than the interface MTU but bigger than the MTU of an intermediate router, with IP_DF set (the NIC has an MTU of 4400, and pmtu is 1500).

What is observed is that when the route cache is flushed, it works as expected: we get a "Fragmentation Required" message from the intermediate router. But when the cache is *not* flushed, the packets are fragmented based on pmtu before being sent out on the net, in spite of the fact that IP_DF is set (based on the tcpdump observations).

Is this the right behavior?

But looking at 2.4.20 kernel code in "include/net/ip.h": Not sure if "ip_send" is in the call tree or not; I am *not* intimately familiar with the code ...
:-(

------------------
static inline int ip_send(struct sk_buff *skb)
{
	if (skb->len > skb->dst->pmtu)
		return ip_fragment(skb, ip_finish_output);
	else
		return ip_finish_output(skb);
}
------------------

This seems to indicate that it would fragment the packet if the packet is bigger than the path MTU, irrespective of the IP_DF flag.

Is there a way to get the host to not fragment when IP_DF is set, irrespective of what the pmtu is?

Thanks!
--rr

From sri@us.ibm.com Thu Mar 6 10:47:07 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 10:47:13 -0800 (PST)
Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26Il0q9003014 for ; Thu, 6 Mar 2003 10:47:07 -0800
Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e6.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h26IkrZu049886; Thu, 6 Mar 2003 13:46:53 -0500
Received: from dyn9-47-18-140.beaverton.ibm.com (dyn9-47-18-140.beaverton.ibm.com [9.47.18.140]) by northrelay01.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h26IknPO062020; Thu, 6 Mar 2003 13:46:50 -0500
Date: Thu, 6 Mar 2003 10:30:33 -0800 (PST)
From: Sridhar Samudrala
X-X-Sender: sridhar@dyn9-47-18-140.beaverton.ibm.com
To: "Raghurama 'REDDY'"
cc: NETDEV@oss.sgi.com,
Subject: Re: Output on raw sockets ignores IP_DF when packet is bigger than pmtu
In-Reply-To: <03030611014533.2221238e.7643064@psc.edu>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 1882
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sri@us.ibm.com
Precedence: bulk
X-list: netdev
Content-Length: 345
Lines: 13

On Thu, 6 Mar 2003, Raghurama 'REDDY' wrote:
> Is there a way to get the host to not fragment when IP_DF is set,
> irrespective of what the pmtu is?

You can disable pmtu discovery on a socket using the IP level socket option IP_MTU_DISCOVER.
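
For reference, setting that per-socket option looks roughly like this. It is a minimal sketch, assuming a Linux system whose <netinet/in.h> exposes IP_MTU_DISCOVER and IP_PMTUDISC_DONT; disable_pmtu_discovery() is a hypothetical wrapper name, not an existing API, and whether this gives the behavior wanted for a raw socket that builds its own IP header is not settled by this sketch.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT */

/*
 * Hypothetical wrapper: ask the kernel never to do path MTU discovery
 * on this socket. Returns 0 on success, -1 on error with errno set,
 * exactly as setsockopt() does.
 */
int disable_pmtu_discovery(int fd)
{
    int val = IP_PMTUDISC_DONT;

    assert(fd >= 0);   /* caller must pass an open socket */
    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
}
```

The current setting can be read back with getsockopt() using the same level and option name.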
System wide pmtu discovery can be disabled by setting /proc/sys/net/ipv4/ip_no_pmtu_disc -Sridhar From davem@redhat.com Thu Mar 6 10:50:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 10:50:43 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26Inxq9003683 for ; Thu, 6 Mar 2003 10:50:39 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id KAA19604; Thu, 6 Mar 2003 10:31:43 -0800 Date: Thu, 06 Mar 2003 10:31:42 -0800 (PST) Message-Id: <20030306.103142.58817243.davem@redhat.com> To: eric@lammerts.org Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com, alan@lxorguk.ukuu.org.uk Subject: Re: [PATCH] wrong ENETDOWN in af_packet? From: "David S. Miller" In-Reply-To: <20030305141123.GA16699@ally.lammerts.org> References: <20030305141123.GA16699@ally.lammerts.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1883 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 439 Lines: 10 From: Eric Lammerts Date: Wed, 5 Mar 2003 15:11:23 +0100 The reason is that (in af_packet.c) packet_notifier(NETDEV_DOWN) sets sk->err to ENETDOWN, but packet_notifier(NETDEV_UP) doesn't clear it. Is this behaviour deliberate? Yes the behavior is deliberate. You want to be aware of the event. 
Just because the opposite event has occurred afterwards doesn't mean the first event didn't happen :-) From holt@sgi.com Thu Mar 6 13:11:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 13:11:05 -0800 (PST) Received: from tolkor.sgi.com ([198.149.18.6]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26LALq9010348 for ; Thu, 6 Mar 2003 13:11:01 -0800 Received: from ledzep.americas.sgi.com (ledzep.americas.sgi.com [192.48.203.134]) by tolkor.sgi.com (8.12.2/8.12.2/linux-outbound_gateway-1.2) with ESMTP id h26LKokq007999 for ; Thu, 6 Mar 2003 15:20:50 -0600 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by ledzep.americas.sgi.com (SGI-8.9.3/americas-smart-nospam1.1) with ESMTP id PAA55938; Thu, 6 Mar 2003 15:10:14 -0600 (CST) Received: from mandrake.americas.sgi.com (mandrake.americas.sgi.com [128.162.232.96]) by thistle-e236.americas.sgi.com (8.12.8/SGI-server-1.8) with ESMTP id h26LAEvw6978040; Thu, 6 Mar 2003 15:10:15 -0600 (CST) Received: from mandrake.americas.sgi.com (localhost.localdomain [127.0.0.1]) by mandrake.americas.sgi.com (8.12.5/8.11.6/erikj-RedHat-7.2-Eagan) with ESMTP id h26LAEst031494; Thu, 6 Mar 2003 15:10:14 -0600 Received: from localhost (holt@localhost) by mandrake.americas.sgi.com (8.12.5/8.12.5/Submit) with ESMTP id h26LAEct031490; Thu, 6 Mar 2003 15:10:14 -0600 X-Authentication-Warning: mandrake.americas.sgi.com: holt owned process doing -bs Date: Thu, 6 Mar 2003 15:10:13 -0600 (CST) From: Robin Holt X-X-Sender: holt@mandrake.americas.sgi.com To: Linux Kernel Mailing List , Subject: Make ipconfig.c work as a loadable module. 
Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1884 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: holt@sgi.com Precedence: bulk X-list: netdev Content-Length: 6072 Lines: 208 The patch at the end of this email makes ipconfig.c work as a loadable module under the 2.5. The diff was taken against the bitkeeper tree changeset 1.1075. Currently ipconfig.o must get statically linked into the kernel. I have a proprietary driver which the supplier will not provide a GPL version or info. In order to mount root over NFS, I need to get the vendors driver loaded via a ramdisk. A couple more items get moved from ipconfig.h to nfs_fs.h. Thanks, Robin Holt ------------------------- Patch --------------------------------------- ===== fs/Kconfig 1.18 vs edited ===== --- 1.18/fs/Kconfig Sun Feb 9 19:29:49 2003 +++ edited/fs/Kconfig Wed Mar 5 11:07:56 2003 @@ -1270,7 +1270,7 @@ config ROOT_NFS bool "Root file system on NFS" - depends on NFS_FS=y && IP_PNP + depends on NFS_FS=y && IP_PNP!=n help If you want your Linux box to mount its whole root file system (the one containing the directory /) from some other computer over the ===== fs/nfs/nfsroot.c 1.11 vs edited ===== --- 1.11/fs/nfs/nfsroot.c Thu Nov 7 11:29:59 2002 +++ edited/fs/nfs/nfsroot.c Wed Mar 5 11:07:56 2003 @@ -69,6 +69,7 @@ */ #include +#include #include #include #include @@ -106,6 +107,15 @@ static struct nfs_mount_data nfs_data __initdata = { 0, };/* NFS mount info */ static int nfs_port __initdata = 0; /* Port to connect to for NFS */ static int mount_port __initdata = 0; /* Mount daemon port number */ + + +u32 root_server_addr __initdata = INADDR_NONE; /* Address of NFS server */ +u8 root_server_path[NFS_ROOT_PATH_LEN] __initdata = { 0, }; /* Path to mount as root */ + +#ifdef CONFIG_IP_PNP_MODULE +EXPORT_SYMBOL(root_server_addr); +EXPORT_SYMBOL(root_server_path); +#endif 
/*************************************************************************** ===== include/linux/nfs_fs.h 1.43 vs edited ===== --- 1.43/include/linux/nfs_fs.h Sat Dec 21 00:29:02 2002 +++ edited/include/linux/nfs_fs.h Wed Mar 5 11:07:56 2003 @@ -417,7 +417,12 @@ /* NFS root */ +#ifdef CONFIG_ROOT_NFS +#define NFS_ROOT_PATH_LEN 256 +extern u8 root_server_path[NFS_ROOT_PATH_LEN]; /* Path to mount as root */ + extern void * nfs_root_data(void); +#endif #define nfs_wait_event(clnt, wq, condition) \ ({ \ ===== include/net/ipconfig.h 1.2 vs edited ===== --- 1.2/include/net/ipconfig.h Tue Feb 5 01:40:15 2002 +++ edited/include/net/ipconfig.h Wed Mar 5 11:07:56 2003 @@ -21,7 +21,6 @@ extern u32 ic_servaddr; /* Boot server IP address */ extern u32 root_server_addr; /* Address of NFS server */ -extern u8 root_server_path[]; /* Path to mount as root */ ===== net/ipv4/Kconfig 1.4 vs edited ===== --- 1.4/net/ipv4/Kconfig Wed Nov 13 06:52:02 2002 +++ edited/net/ipv4/Kconfig Wed Mar 5 11:07:56 2003 @@ -133,8 +133,8 @@ you may want to say Y here to speed up the routing process. 
config IP_PNP - bool "IP: kernel level autoconfiguration" - depends on INET + tristate "IP: kernel level autoconfiguration" + depends on INET!=n help This enables automatic configuration of IP addresses of devices and of the routing table during kernel boot, based on either information @@ -146,7 +146,7 @@ config IP_PNP_DHCP bool "IP: DHCP support" - depends on IP_PNP + depends on IP_PNP!=n ---help--- If you want your Linux box to mount its whole root file system (the one containing the directory /) from some other computer over the @@ -163,7 +163,7 @@ config IP_PNP_BOOTP bool "IP: BOOTP support" - depends on IP_PNP + depends on IP_PNP!=n ---help--- If you want your Linux box to mount its whole root file system (the one containing the directory /) from some other computer over the @@ -178,7 +178,7 @@ config IP_PNP_RARP bool "IP: RARP support" - depends on IP_PNP + depends on IP_PNP!=n help If you want your Linux box to mount its whole root file system (the one containing the directory /) from some other computer over the ===== net/ipv4/ipconfig.c 1.22 vs edited ===== --- 1.22/net/ipv4/ipconfig.c Tue Feb 18 12:38:27 2003 +++ edited/net/ipv4/ipconfig.c Wed Mar 5 11:07:56 2003 @@ -32,6 +32,7 @@ */ #include +#include #include #include #include @@ -52,6 +53,7 @@ #include #include #include +#include #include #include #include @@ -131,9 +133,6 @@ u32 ic_servaddr __initdata = INADDR_NONE; /* Boot server IP address */ -u32 root_server_addr __initdata = INADDR_NONE; /* Address of NFS server */ -u8 root_server_path[256] __initdata = { 0, }; /* Path to mount as root */ - /* Persistent data: */ int ic_proto_used; /* Protocol used, if any */ @@ -1136,6 +1135,7 @@ unsigned long jiff; #ifdef CONFIG_PROC_FS + /* >>> Need to remove this on unload!!! 
*/ proc_net_create("pnp", 0, pnp_get_info); #endif /* CONFIG_PROC_FS */ @@ -1263,8 +1263,6 @@ return 0; } -module_init(ip_auto_config); - /* * Decode any IP configuration options in the "ip=" or "nfsaddrs=" kernel @@ -1386,6 +1384,29 @@ return 1; } + +#ifdef CONFIG_IP_PNP_MODULE +char *ip = NULL; +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Martin Mares"); +MODULE_DESCRIPTION("IP Autoconfig module: \n" \ + "Uses BOOTP/DHCP/RARP to determine IP configuration before the root\n" + " filesystem is mounted. See nfsroot.txt in the kernel source."); +MODULE_PARM(ip, "s"); +MODULE_PARM_DESC(ip, "[[]:[]:[]:[]:[]:[]:]"); + + +int __init init_module(void) +{ + if (ip != NULL) { + ip_auto_config_setup(ip); + } + + return ip_auto_config(); +} +#else +module_init(ip_auto_config); + static int __init nfsaddrs_config_setup(char *addrs) { return ip_auto_config_setup(addrs); @@ -1393,3 +1414,4 @@ __setup("ip=", ip_auto_config_setup); __setup("nfsaddrs=", nfsaddrs_config_setup); +#endif From alan@lxorguk.ukuu.org.uk Thu Mar 6 13:28:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 13:28:59 -0800 (PST) Received: from irongate.swansea.linux.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26LSsq9012485 for ; Thu, 6 Mar 2003 13:28:56 -0800 Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.7/8.12.7) with ESMTP id h26MYHYf018976; Thu, 6 Mar 2003 22:34:17 GMT Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.7/8.12.7/Submit) id h26MYG6T018974; Thu, 6 Mar 2003 22:34:16 GMT X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: Make ipconfig.c work as a loadable module. 
From: Alan Cox To: Robin Holt Cc: Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 (1.2.1-4) Date: 06 Mar 2003 22:34:16 +0000 X-archive-position: 1885 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev Content-Length: 619 Lines: 14 On Thu, 2003-03-06 at 21:10, Robin Holt wrote: > The patch at the end of this email makes ipconfig.c work as a loadable > module under the 2.5. The diff was taken against the bitkeeper tree > changeset 1.1075. The right fix is to delete ipconfig.c, it has been the right fix for a long long time. There are initrd based bootp/dhcp setups that can also then mount a root NFS partition and they do *not* need any kernel helper. Indeed probably the biggest distro using nfs root (LTSP) doesn't use ipconfig even on 2.4. DaveM can you just remove the thing. See http://www.ltsp.org for initrds that don't need it in From cw@f00f.org Thu Mar 6 13:32:18 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 13:32:20 -0800 (PST) Received: from tapu.f00f.org (tapu.f00f.org [202.49.232.129]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26LWHq9013327 for ; Thu, 6 Mar 2003 13:32:18 -0800 Received: by tapu.f00f.org (Postfix, from userid 10000) id C73351830E0C; Thu, 6 Mar 2003 13:32:17 -0800 (PST) Date: Thu, 6 Mar 2003 13:32:17 -0800 From: Chris Wedgwood To: "David S. 
Miller" Cc: yoshfuji@linux-ipv6.org, kazunori@miyazawa.org, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi@linux-ipv6.org Subject: Re: (usagi-core 12294) Re: [PATCH] IPv6 IPsec support Message-ID: <20030306213217.GA6358@f00f.org> References: <20030305233025.784feb00.kazunori@miyazawa.org> <20030305.072149.121185037.davem@redhat.com> <20030306.004820.41101302.yoshfuji@linux-ipv6.org> <20030305.154100.28816301.davem@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030305.154100.28816301.davem@redhat.com> User-Agent: Mutt/1.3.28i X-No-Archive: Yes X-archive-position: 1886 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: netdev Content-Length: 406 Lines: 12 On Wed, Mar 05, 2003 at 03:41:00PM -0800, David S. Miller wrote: > Note that this coincides with the idea to eventually have an > address-family independant flow cache. Actually... at that point being able to monitor updates to the flow-cache would be useful for various statistical purposes and applications, especially if the flow cache was able to periodically export utilization counters... --cw From garzik@gtf.org Thu Mar 6 14:11:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 14:11:46 -0800 (PST) Received: from havoc.gtf.org (havoc.daloft.com [64.213.145.173]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26MBgq9016737 for ; Thu, 6 Mar 2003 14:11:43 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id E3A7D6659; Thu, 6 Mar 2003 17:11:36 -0500 (EST) Date: Thu, 6 Mar 2003 17:11:36 -0500 From: Jeff Garzik To: Alan Cox Cc: Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. 
Message-ID: <20030306221136.GB26732@gtf.org> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> User-Agent: Mutt/1.3.28i X-archive-position: 1887 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 675 Lines: 19 On Thu, Mar 06, 2003 at 10:34:16PM +0000, Alan Cox wrote: > On Thu, 2003-03-06 at 21:10, Robin Holt wrote: > > The patch at the end of this email makes ipconfig.c work as a loadable > > module under the 2.5. The diff was taken against the bitkeeper tree > > changeset 1.1075. > > The right fix is to delete ipconfig.c, it has been the right fix for a long > long time. There are initrd based bootp/dhcp setups that can also then mount > a root NFS partition and they do *not* need any kernel helper. The klibc tarball on kernel.org also has ipconfig-type code, waiting for initramfs early userspace :) Many have wanted to delete ipconfig.c for a while now... 
Jeff From rmk@arm.linux.org.uk Thu Mar 6 14:25:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 14:26:02 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26MPuq9018868 for ; Thu, 6 Mar 2003 14:25:58 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18r3oB-0000Oy-00; Thu, 06 Mar 2003 22:25:47 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18r3oA-0005rD-00; Thu, 06 Mar 2003 22:25:46 +0000 Date: Thu, 6 Mar 2003 22:25:46 +0000 From: Russell King To: Jeff Garzik Cc: Alan Cox , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030306222546.K838@flint.arm.linux.org.uk> Mail-Followup-To: Jeff Garzik , Alan Cox , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030306221136.GB26732@gtf.org>; from jgarzik@pobox.com on Thu, Mar 06, 2003 at 05:11:36PM -0500 X-archive-position: 1888 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 1036 Lines: 23 On Thu, Mar 06, 2003 at 05:11:36PM -0500, Jeff Garzik wrote: > On Thu, Mar 06, 2003 at 10:34:16PM +0000, Alan Cox wrote: > > On Thu, 2003-03-06 at 21:10, Robin Holt wrote: > > > The patch at the end of this email makes ipconfig.c work as a loadable > > > module under the 2.5. The diff was taken against the bitkeeper tree > > > changeset 1.1075. 
> > > > The right fix is to delete ipconfig.c, it has been the right fix for a long > > long time. There are initrd based bootp/dhcp setups that can also then mount > > a root NFS partition and they do *not* need any kernel helper. > > The klibc tarball on kernel.org also has ipconfig-type code, waiting for > initramfs early userspace :) > > Many have wanted to delete ipconfig.c for a while now... Yep, can't the deletion wait a couple more weeks or so until klibc gets merged? It's not like ipconfig.c is broken currently, is it? -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html From garzik@gtf.org Thu Mar 6 14:32:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 14:32:26 -0800 (PST) Received: from havoc.gtf.org (havoc.daloft.com [64.213.145.173]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26MWMq9019833 for ; Thu, 6 Mar 2003 14:32:22 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id D3D836659; Thu, 6 Mar 2003 17:32:16 -0500 (EST) Date: Thu, 6 Mar 2003 17:32:16 -0500 From: Jeff Garzik To: Alan Cox , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030306223216.GB28643@gtf.org> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030306222546.K838@flint.arm.linux.org.uk> User-Agent: Mutt/1.3.28i X-archive-position: 1889 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 296 Lines: 12 On Thu, Mar 06, 2003 at 10:25:46PM +0000, Russell King wrote: > Yep, can't the deletion wait a couple more weeks or so until klibc gets > merged? 
It's not like ipconfig.c is broken currently, is it? The klibc merge date appears to be infinity at this point. Probably my fault, too. Jeff From alan@lxorguk.ukuu.org.uk Thu Mar 6 15:09:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 15:09:45 -0800 (PST) Received: from irongate.swansea.linux.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26N8sq9021445 for ; Thu, 6 Mar 2003 15:09:41 -0800 Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.7/8.12.7) with ESMTP id h270E1Yf019240; Fri, 7 Mar 2003 00:14:02 GMT Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.7/8.12.7/Submit) id h270Dw3q019238; Fri, 7 Mar 2003 00:13:58 GMT X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: Make ipconfig.c work as a loadable module. From: Alan Cox To: Russell King Cc: Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: <20030306222546.K838@flint.arm.linux.org.uk> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 (1.2.1-4) Date: 07 Mar 2003 00:13:57 +0000 X-archive-position: 1890 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev Content-Length: 751 Lines: 18 On Thu, 2003-03-06 at 22:25, Russell King wrote: > > > The right fix is to delete ipconfig.c, it has been the right fix for a long > > > long time. 
There are initrd based bootp/dhcp setups that can also then mount > > > a root NFS partition and they do *not* need any kernel helper. > > > > The klibc tarball on kernel.org also has ipconfig-type code, waiting for > > initramfs early userspace :) > > > > Many have wanted to delete ipconfig.c for a while now... > > Yep, can't the deletion wait a couple more weeks or so until klibc gets > merged? It's not like ipconfig.c is broken currently, is it? Thats how it ended up in 2.4. Klibc doesnt really matter, the apps exist linked with dietlibc and stuff even without klibc. Time for it to die From rmk@arm.linux.org.uk Thu Mar 6 15:19:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 15:19:18 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26NJEq9022298 for ; Thu, 6 Mar 2003 15:19:15 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18r4dn-0000bx-00; Thu, 06 Mar 2003 23:19:07 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18r4dl-0006LP-00; Thu, 06 Mar 2003 23:19:05 +0000 Date: Thu, 6 Mar 2003 23:19:05 +0000 From: Russell King To: Alan Cox Cc: Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. 
Message-ID: <20030306231905.M838@flint.arm.linux.org.uk> Mail-Followup-To: Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <1046996037.18158.142.camel@irongate.swansea.linux.org.uk>; from alan@lxorguk.ukuu.org.uk on Fri, Mar 07, 2003 at 12:13:57AM +0000 X-archive-position: 1891 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 1295 Lines: 32 On Fri, Mar 07, 2003 at 12:13:57AM +0000, Alan Cox wrote: > On Thu, 2003-03-06 at 22:25, Russell King wrote: > > > > The right fix is to delete ipconfig.c, it has been the right fix for a long > > > > long time. There are initrd based bootp/dhcp setups that can also then mount > > > > a root NFS partition and they do *not* need any kernel helper. > > > > > > The klibc tarball on kernel.org also has ipconfig-type code, waiting for > > > initramfs early userspace :) > > > > > > Many have wanted to delete ipconfig.c for a while now... > > > > Yep, can't the deletion wait a couple more weeks or so until klibc gets > > merged? It's not like ipconfig.c is broken currently, is it? > > Thats how it ended up in 2.4. Klibc doesnt really matter, the apps exist > linked with dietlibc and stuff even without klibc. > > Time for it to die "klibc doesnt really matter" I'd prefer not to have to have thousands of special programs around just to be able to boot my machines, especially when it was all in- kernel up until this point. klibc yes, dietlibc with random other garbage in some random filesystem which'd need maintaining - no thanks. 
-- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html From alan@lxorguk.ukuu.org.uk Thu Mar 6 15:24:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 15:24:32 -0800 (PST) Received: from irongate.swansea.linux.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26NOSq9022738 for ; Thu, 6 Mar 2003 15:24:29 -0800 Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.7/8.12.7) with ESMTP id h270ToYf019275; Fri, 7 Mar 2003 00:29:51 GMT Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.7/8.12.7/Submit) id h270TmNx019273; Fri, 7 Mar 2003 00:29:48 GMT X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: Make ipconfig.c work as a loadable module. From: Alan Cox To: Russell King Cc: Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: <20030306231905.M838@flint.arm.linux.org.uk> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 (1.2.1-4) Date: 07 Mar 2003 00:29:47 +0000 X-archive-position: 1892 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev Content-Length: 599 Lines: 14 On Thu, 2003-03-06 at 23:19, Russell King wrote: > "klibc doesnt really matter" > > I'd prefer not to have to have thousands of special programs around > just to be able to 
boot my machines, especially when it was all in- > kernel up until this point. > > klibc yes, dietlibc with random other garbage in some random filesystem > which'd need maintaining - no thanks. You can build the dhcp client with glibc static into your initrd. Its hardly magic or special programs or random garbage, and last time I counted it came to one program. Dunno what the other 999 utilities your dhcp needs are ? From davem@redhat.com Thu Mar 6 15:45:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 15:46:00 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h26Njtq9023628 for ; Thu, 6 Mar 2003 15:45:56 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id PAA20292; Thu, 6 Mar 2003 15:27:04 -0800 Date: Thu, 06 Mar 2003 15:27:03 -0800 (PST) Message-Id: <20030306.152703.21845381.davem@redhat.com> To: cw@f00f.org Cc: yoshfuji@linux-ipv6.org, kazunori@miyazawa.org, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi@linux-ipv6.org Subject: Re: (usagi-core 12294) Re: [PATCH] IPv6 IPsec support From: "David S. Miller" In-Reply-To: <20030306213217.GA6358@f00f.org> References: <20030306.004820.41101302.yoshfuji@linux-ipv6.org> <20030305.154100.28816301.davem@redhat.com> <20030306213217.GA6358@f00f.org> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1893 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 395 Lines: 11 From: Chris Wedgwood Date: Thu, 6 Mar 2003 13:32:17 -0800 Actually... 
at that point being able to monitor updates to the flow-cache would be useful for various statistical purposes and applications, especially if the flow cache was able to periodically export utilization counters... It will keep statistics, just like the route cache keeps them now. From rmk@arm.linux.org.uk Thu Mar 6 16:09:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 16:09:10 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2708Pq9024526 for ; Thu, 6 Mar 2003 16:09:06 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18r5PN-0000oC-00; Fri, 07 Mar 2003 00:08:17 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18r5PM-0006qL-00; Fri, 07 Mar 2003 00:08:16 +0000 Date: Fri, 7 Mar 2003 00:08:16 +0000 From: Russell King To: Alan Cox Cc: Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. 
Message-ID: <20030307000816.P838@flint.arm.linux.org.uk> Mail-Followup-To: Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <1046996987.17718.144.camel@irongate.swansea.linux.org.uk>; from alan@lxorguk.ukuu.org.uk on Fri, Mar 07, 2003 at 12:29:47AM +0000 X-archive-position: 1894 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 1284 Lines: 29 On Fri, Mar 07, 2003 at 12:29:47AM +0000, Alan Cox wrote: > On Thu, 2003-03-06 at 23:19, Russell King wrote: > > "klibc doesnt really matter" > > > > I'd prefer not to have to have thousands of special programs around > > just to be able to boot my machines, especially when it was all in- > > kernel up until this point. > > > > klibc yes, dietlibc with random other garbage in some random filesystem > > which'd need maintaining - no thanks. > > You can build the dhcp client with glibc static into your initrd. Its hardly > magic or special programs or random garbage, and last time I counted it came > to one program. Dunno what the other 999 utilities your dhcp needs are ? How about mount for nfs-root, a shell and a shell script to supply the correct parameters to mount so it doesn't go and try to mount the nfs-root with locking enabled - oh, and a few programs like sed and so forth to pull the mount parameters out of the dhcp client output, if there is such an output. ipconfig.c does more than just configure networking. 
It's a far smaller solution to NFS-root than any userspace implementation could ever hope to be. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html From pakrat@www.linux.org.uk Thu Mar 6 17:29:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 17:29:14 -0800 (PST) Received: from www.linux.org.uk (IDENT:exim@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h271T7q9002297 for ; Thu, 6 Mar 2003 17:29:08 -0800 Received: from pakrat by www.linux.org.uk with local (Exim 3.33 #5) id 18r6fZ-0001IZ-00; Fri, 07 Mar 2003 01:29:05 +0000 Date: Fri, 7 Mar 2003 01:29:05 +0000 From: Chris Dukes To: Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <20030307000816.P838@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030307000816.P838@flint.arm.linux.org.uk>; from rmk@arm.linux.org.uk on Fri, Mar 07, 2003 at 12:08:16AM +0000 X-archive-position: 1895 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pakrat@www.uk.linux.org Precedence: bulk X-list: netdev Content-Length: 1248 Lines: 28 On Fri, Mar 07, 2003 at 12:08:16AM +0000, Russell King wrote: > > > > You can build the dhcp client with glibc static into your initrd. 
Its hardly > > magic or special programs or random garbage, and last time I counted it came > > to one program. Dunno what the other 999 utilities your dhcp needs are ? > > How about mount for nfs-root, a shell and a shell script to supply the > correct parameters to mount so it doesn't go and try to mount the > nfs-root with locking enabled - oh, and a few programs like sed and > so forth to pull the mount parameters out of the dhcp client output, > if there is such an output. If IBM can fit a kernel and a ramdisk containing all the utilities you describe and more in smaller than 5M of file for tftp, one would think that it could be done on Linux. > > ipconfig.c does more than just configure networking. It's a far smaller > solution to NFS-root than any userspace implementation could ever hope > to be. That's nice. Would you mind explaining to us where that would be a benefit? Aside from dead header space in elf executables, I'm at a loss as to how a usermode implementation must be significantly larger than kernel code. -- Chris Dukes I tried being reasonable once--I didn't like it. 
From cfriesen@nortelnetworks.com Thu Mar 6 21:48:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 21:48:37 -0800 (PST) Received: from zcars04f.nortelnetworks.com (zcars04f.nortelnetworks.com [47.129.242.57]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h275mUq9027567 for ; Thu, 6 Mar 2003 21:48:31 -0800 Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars04f.nortelnetworks.com (Switch-2.2.5/Switch-2.2.0) with ESMTP id h275mMj19044; Fri, 7 Mar 2003 00:48:22 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDF44DJG; Fri, 7 Mar 2003 00:48:23 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7ZCDR; Fri, 7 Mar 2003 00:48:23 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 27AFA2D957; Fri, 7 Mar 2003 00:48:22 -0500 (EST) Message-ID: <3E6832A5.2020502@nortelnetworks.com> Date: Fri, 07 Mar 2003 00:48:21 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: linux-kernel , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: unix socket latency regression from 2.4 to 2.5 (and multicast AF_UNIX benchmarks) Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1896 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 3220 Lines: 64 I've done another series of benchmarks with regards to multicast AF_UNIX on the 2.5.63 kernel. 
One of the biggest surprises to me was the performance regression in the normal userspace case and in the kernel case with small numbers of listeners.

The tests basically work by having a single sender send messages of three different sizes to varying numbers of listeners, which take a timestamp and then nanosleep() for a second to allow the other listeners to wake up as fast as possible. When the listeners wake up they figure out the latency and send it on to a third utility which dumps out the latencies.

In the case of the userspace test, the sender sends to each listener in turn. In the case of the multicast test, the kernel handles the cloning of the packet to distribute it to the listeners.

Here are the new results combined with the old for comparison. The machine is a Duron 750, with K7 optimizations in the kernel. Both kernels were compiled with gcc 3.2.

44 bytes
              2.4.20       2.5.63       2.4.20       2.5.63
# listeners   userspace    userspace    kernelspace  kernelspace
10            73,335       96,493       103,252      100,286
20            72,610       99,885       106,429      134,517
50            74,1482      95,2075      205,1301     230,1273
100           76,3000      97,4173      362,3425     431,2654
200           107,8719 737,9917 831,5412   (one of the four values is absent; column alignment lost)

236 bytes
              2.4.20       2.5.63       2.4.20       2.5.63
# listeners   userspace    userspace    kernelspace  kernelspace
10            70,346       98,510       81,265       100,290
20            74,639       100,918      122,468      137,533
50            75,1557      103,2225     230,1421     238,1329
100           80,3107      105,4415     408,3743     461,2794
200           131,9117 889,5720   (two of the four values are absent; column alignment lost)

40036 bytes
              2.4.20       2.5.63       2.4.20       2.5.63
# listeners   userspace    userspace    kernelspace  kernelspace
10            302,4181     841,6218     322,1692     702,2231
20            303,7491     873,12606    347,3450     722,3829
50            306,10451    868,38031    483,8394     884,8583
100           309,23107    881,69403    697,17061    1137,16729
200           313,45528    898,132887   997,39810    1586,32722

It appears that sending/receiving is significantly more expensive in 2.5 than it was in 2.4, with the difference going up as the size of the message goes up. Is 2.5 using different copying code or something? Anyone have any ideas as to what is going on here?
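
The userspace case described above (the sender sends to each listener in turn) can be sketched roughly as follows. This is not the code used for the measurements: the function names fanout_send and fanout_selftest, the use of socketpair() instead of named sockets, and the two limits are assumptions made purely for illustration, and the sketch leaves out the timestamping and nanosleep() steps.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_LISTENERS 200    /* largest listener count in the tables */
#define MAX_MSG       40036  /* largest message size in the tables */

/*
 * Userspace fan-out: the sender issues one send() per listener, so the
 * per-message cost grows with the number of listeners.
 * Returns how many listeners accepted the whole message.
 */
static int fanout_send(const int *fds, int nlisteners,
                       const void *msg, size_t len)
{
    int delivered = 0;

    assert(fds != NULL || nlisteners == 0);
    for (int i = 0; i < nlisteners; i++) {
        if (send(fds[i], msg, len, 0) == (ssize_t)len)
            delivered++;
    }
    return delivered;
}

/*
 * Hypothetical self-check (not part of the original benchmark): create one
 * AF_UNIX datagram socketpair per listener, fan a message out, and count
 * the listeners that read back exactly the bytes that were sent.
 * Returns that count, or -1 on bad arguments or socket failure.
 */
static int fanout_selftest(int nlisteners, size_t len)
{
    static char msg[MAX_MSG], buf[MAX_MSG];
    int tx[MAX_LISTENERS], rx[MAX_LISTENERS];
    int created = 0, matched = 0;

    if (nlisteners < 0 || nlisteners > MAX_LISTENERS || len > MAX_MSG)
        return -1;

    memset(msg, 'x', len);
    for (; created < nlisteners; created++) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
            break;
        tx[created] = sv[0];   /* sender's end */
        rx[created] = sv[1];   /* listener's end */
    }

    if (created == nlisteners) {
        fanout_send(tx, nlisteners, msg, len);
        for (int i = 0; i < nlisteners; i++) {
            ssize_t n = recv(rx[i], buf, sizeof(buf), MSG_DONTWAIT);
            if (n == (ssize_t)len && memcmp(buf, msg, len) == 0)
                matched++;
        }
    }

    for (int i = 0; i < created; i++) {
        close(tx[i]);
        close(rx[i]);
    }
    return created == nlisteners ? matched : -1;
}
```

Calling fanout_selftest(10, 44) exercises the smallest configuration from the tables. The point of the sketch is only that the sender makes one send() call per listener here, whereas with the proposed multicast AF_UNIX address the kernel would clone the packet after a single send.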
Also, even with the increased copying costs, the O(1) scheduler in 2.5 means that the kernelspace multicast solution is faster than the userspace solution in either kernel in all cases, even when waking up 200 listeners simultaneously. Any comments on the multicast concept that haven't been discussed already? Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From malware@t-online.de Thu Mar 6 23:15:55 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 06 Mar 2003 23:16:03 -0800 (PST) Received: from mailout01.sul.t-online.com (mailout01.sul.t-online.com [194.25.134.80]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h277Frq9032746 for ; Thu, 6 Mar 2003 23:15:55 -0800 Received: from fwd01.sul.t-online.de by mailout01.sul.t-online.com with smtp id 18rC59-0004vI-0H; Fri, 07 Mar 2003 08:15:51 +0100 Received: from fire.malware.de (320008702754-0001@[217.0.134.77]) by fwd01.sul.t-online.com with esmtp id 18rC51-1y3EsiC; Fri, 7 Mar 2003 08:15:43 +0100 Message-Id: <200303070715.IAA27138@fire.malware.de> Date: Fri, 07 Mar 2003 08:15:20 +0100 From: malware@t-online.de (Michael Mueller) X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.20-pre8 i686) X-Accept-Language: en, de MIME-Version: 1.0 To: Alan Cox CC: Russell King , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. 
References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Sender: 320008702754-0001@t-dialin.net X-archive-position: 1897 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: malware@t-online.de Precedence: bulk X-list: netdev Content-Length: 925 Lines: 23

Hi Alan,

you wrote:
> > I'd prefer not to have to have thousands of special programs around
> > just to be able to boot my machines, especially when it was all in-
> > kernel up until this point.
> >
> > klibc yes, dietlibc with random other garbage in some random filesystem
> > which'd need maintaining - no thanks.
>
> You can build the dhcp client with glibc static into your initrd. It's hardly
> magic or special programs or random garbage, and last time I counted it came
> to one program. Dunno what the other 999 utilities your dhcp needs are?

Sorry, but I must join Russell here. I have at least one machine which has a bootloader able to load exactly one file only. There is currently no way to load an initrd. It would need to implement the whole (BOOTP+)TFTP stuff again, just to get the initrd. So I was quite happy linux 2.4 still knows about mounting an NFS root filesystem without user-space help.
Michael From vda@port.imtp.ilyichevsk.odessa.ua Fri Mar 7 01:24:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 01:25:03 -0800 (PST) Received: from Port.imtp.ilyichevsk.odessa.ua (169.imtp.Ilyichevsk.Odessa.UA [195.66.192.169] (may be forged)) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h279O5q9010501 for ; Fri, 7 Mar 2003 01:24:56 -0800 Received: from there ([172.16.42.177]) by Port.imtp.ilyichevsk.odessa.ua (8.10.2/8.10.2) with SMTP id h279Cpu07949; Fri, 7 Mar 2003 11:13:04 +0200 Message-Id: <200303070913.h279Cpu07949@Port.imtp.ilyichevsk.odessa.ua> Content-Type: text/plain; charset="koi8-r" From: Denis Vlasenko Reply-To: vda@port.imtp.ilyichevsk.odessa.ua To: Alan Cox , Russell King Subject: Re: Make ipconfig.c work as a loadable module. Date: Fri, 7 Mar 2003 11:10:15 +0200 X-Mailer: KMail [version 1.3.2] Cc: Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> In-Reply-To: <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-archive-position: 1898 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vda@port.imtp.ilyichevsk.odessa.ua Precedence: bulk X-list: netdev Content-Length: 558 Lines: 16 On 7 March 2003 02:29, Alan Cox wrote: > On Thu, 2003-03-06 at 23:19, Russell King wrote: > > "klibc doesnt really matter" > > > > I'd prefer not to have to have thousands of special programs around > > just to be able to boot my machines, especially when it was all in- > > kernel up until this point. > > > > klibc yes, dietlibc with random other garbage in some random > > filesystem which'd need maintaining - no thanks. > > You can build the dhcp client with glibc static into your initrd. Anything built static against glibs tends to be 400K+. 
-- vda From rmk@arm.linux.org.uk Fri Mar 7 01:43:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 01:43:29 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h279giq9027483 for ; Fri, 7 Mar 2003 01:43:25 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18rENB-0002o8-00; Fri, 07 Mar 2003 09:42:37 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18rEN9-0003NV-00; Fri, 07 Mar 2003 09:42:35 +0000 Date: Fri, 7 Mar 2003 09:42:35 +0000 From: Russell King To: Chris Dukes Cc: Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030307094235.A11807@flint.arm.linux.org.uk> Mail-Followup-To: Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <20030307000816.P838@flint.arm.linux.org.uk> <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk>; from pakrat@www.uk.linux.org on Fri, Mar 07, 2003 at 01:29:05AM +0000 X-archive-position: 1899 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 2213 Lines: 58 On Fri, Mar 07, 2003 at 01:29:05AM +0000, Chris Dukes 
wrote:
> If IBM can fit a kernel and a ramdisk containing all the utilities you
> describe and more in smaller than 5M of file for tftp, one would think
> that it could be done on Linux.

Wow. 5MB eh? We currently do NFS-root in 690K.

> > ipconfig.c does more than just configure networking. It's a far smaller
> > solution to NFS-root than any userspace implementation could ever hope
> > to be.
>
> That's nice. Would you mind explaining to us where that would be a
> benefit? Aside from dead header space in elf executables, I'm at
> a loss as to how a usermode implementation must be significantly
> larger than kernel code.

If you're suggesting above that "5MB isn't significantly larger than the size Linux can do this" then I think I've just proven you wrong.

Let's see - building a ramdisk to mount a root filesystem out of existing binaries would require from my existing systems probably something like:

   text    data     bss     dec     hex filename
1093047   21224   15560 1129831  113d67 /lib/libc.so.6
 515890   22320   16640  554850   87762 /bin/sh
  58540    2436    9776   70752   11460 /lib/libresolv.so.2
  53685    1476    5488   60649    ece9 /bin/mount
  45511     672     432   46615    b617 /bin/sed
  42830     624      40   43494    a9e6 /sbin/pump
  10783     500     104   11387    2c7b /lib/libtermcap.so.2
   8765     444      28    9237    2415 /lib/libdl.so.2

pump isn't really suitable for the task, but I don't have dhcpcd around. dhcpcd is even larger than pump however.

That's getting on for 2MB vs:

   2620    2012       0    4632    1218 fs/nfs/nfsroot.o
   8016     380      80    8476    211c net/ipv4/ipconfig.o

about 13K.

Which version is overly bloated?
Which version is huge?
Which version is compact?

Even the klibc ipconfig version is significantly larger than the in-kernel version - and klibc and its binaries are written to be small.

Note: I *do* agree that ipconfig.c needs to die before 2.6 but I do not agree that today is the right day.
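The totals behind "getting on for 2MB" and "about 13K" can be checked by summing the dec column of the two size listings above. A small sketch, using only the figures quoted in the message; the dictionary and variable names are made up for illustration:

```python
# "dec" column (text + data + bss, in bytes) as quoted in the message.
userspace_binaries = {
    "/lib/libc.so.6": 1129831,
    "/bin/sh": 554850,
    "/lib/libresolv.so.2": 70752,
    "/bin/mount": 60649,
    "/bin/sed": 46615,
    "/sbin/pump": 43494,
    "/lib/libtermcap.so.2": 11387,
    "/lib/libdl.so.2": 9237,
}

in_kernel_objects = {
    "fs/nfs/nfsroot.o": 4632,
    "net/ipv4/ipconfig.o": 8476,
}

userspace_total = sum(userspace_binaries.values())
in_kernel_total = sum(in_kernel_objects.values())

if __name__ == "__main__":
    print(f"userspace binaries: {userspace_total} bytes")
    print(f"in-kernel objects:  {in_kernel_total} bytes")
    print(f"ratio: about {userspace_total / in_kernel_total:.0f}x")
```

The sums come to 1926815 bytes for the userspace binaries and 13108 bytes for the two kernel objects, which matches the rounded figures in the message.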
-- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html From seong@etri.re.kr Fri Mar 7 01:44:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 01:44:50 -0800 (PST) Received: from cms1.etri.re.kr (cms1.etri.re.kr [129.254.16.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h279i0q9027673 for ; Fri, 7 Mar 2003 01:44:42 -0800 Received: from seong (129.254.172.40 [129.254.172.40]) by cms1.etri.re.kr with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GCGCDZR9; Fri, 7 Mar 2003 18:43:44 +0900 Message-ID: <005101c2e48e$4016a4f0$28acfe81@seong> From: "Seong Moon" To: Subject: rtnetlink and multicast routing cache ? Date: Fri, 7 Mar 2003 18:45:13 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4920.2300 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4920.2300 X-archive-position: 1900 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: seong@etri.re.kr Precedence: bulk X-list: netdev Content-Length: 211 Lines: 12 Hi there! I want to get and monitor multicast routing cache information from kernel through rtnetlink. Is it possible ? I'm using linux2.4.18. If it is possible, What can I do for this ? thanks in advance. 
From bogdan.costescu@iwr.uni-heidelberg.de Fri Mar 7 03:46:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 03:46:23 -0800 (PST) Received: from mail.iwr.uni-heidelberg.de (mail.iwr.uni-heidelberg.de [129.206.104.30]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27BkDq9017803 for ; Fri, 7 Mar 2003 03:46:14 -0800 Received: from kenzo.iwr.uni-heidelberg.de (kenzo.iwr.uni-heidelberg.de [129.206.120.29]) by mail.iwr.uni-heidelberg.de (8.11.2/8.11.1) with ESMTP id h27BkBP23646; Fri, 7 Mar 2003 12:46:11 +0100 (MET) Received: from localhost (bogdan@localhost) by kenzo.iwr.uni-heidelberg.de (8.11.6/8.11.6) with ESMTP id h27BkBq31730; Fri, 7 Mar 2003 12:46:11 +0100 Date: Fri, 7 Mar 2003 12:46:11 +0100 (CET) From: Bogdan Costescu To: Russell King cc: Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , Subject: Re: Make ipconfig.c work as a loadable module. In-Reply-To: <20030307094235.A11807@flint.arm.linux.org.uk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1901 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bogdan.costescu@iwr.uni-heidelberg.de Precedence: bulk X-list: netdev Content-Length: 1706 Lines: 36 On Fri, 7 Mar 2003, Russell King wrote: > Which version is overly bloated? > Which version is huge? > Which version is compact? ... and the size is not important only because we want to make everything smaller, but because of how it's commonly used (at least in the clustering world from which I come): the mainboard BIOS or NIC PROC contains PXE/DHCP client; data is transferred through UDP, with very poor (if any) congestion control. 
Congestion control means here both extreme situations: if packets don't arrive to the client, it might not ask again, ask only a limited number of times or give up after some timeout; if the server has some faster NIC to be able to handle more such requests, it might also send too fast for a single client which might drop packets. In some cases, if such situation occurs, the client just blocks there printing an error message on the console, without trying to restart the whole process and the only way to make it do something is to press the Reset button or plug in a keyboard... When you have tens or hundreds of such nodes, it's not a pleasure ! Booting a bunch of such nodes would become problematic if they need to transfer more data (=initrd) to start the kernel and so network booting would become less reliable. Please note that I'm not saying "ipconfig has to stay" - just that any solution should not dramatically increase the size of data transferred before the jump to kernel code. -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De From alan@lxorguk.ukuu.org.uk Fri Mar 7 03:49:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 03:49:12 -0800 (PST) Received: from irongate.swansea.linux.org.uk (pc2-cwma1-4-cust86.swan.cable.ntl.com [213.105.254.86]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27Bn7q9018184 for ; Fri, 7 Mar 2003 03:49:08 -0800 Received: from irongate.swansea.linux.org.uk (localhost [127.0.0.1]) by irongate.swansea.linux.org.uk (8.12.7/8.12.7) with ESMTP id h27CseYf020905; Fri, 7 Mar 2003 12:54:41 GMT Received: (from alan@localhost) by irongate.swansea.linux.org.uk (8.12.7/8.12.7/Submit) id h27CsbnY020903; Fri, 7 Mar 2003 12:54:37 GMT X-Authentication-Warning: irongate.swansea.linux.org.uk: alan set sender to alan@lxorguk.ukuu.org.uk 
using -f Subject: Re: Make ipconfig.c work as a loadable module. From: Alan Cox To: Michael Mueller Cc: Russell King , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: <200303070715.IAA27138@fire.malware.de> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <200303070715.IAA27138@fire.malware.de> Content-Type: text/plain Content-Transfer-Encoding: 7bit Organization: Message-Id: <1047041676.20793.12.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.1 (1.2.1-4) Date: 07 Mar 2003 12:54:36 +0000 X-archive-position: 1902 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev Content-Length: 499 Lines: 11 On Fri, 2003-03-07 at 07:15, Michael Mueller wrote: > Hi Alan, > Sorry, but I must join Russel here. I have atleast one machine which has > a bootloader able to load exactly one file only. There is currently no > way to load an initrd. It would need to implement the whole (BOOTP+)TFTP > stuff again, just to get the initrd. So I was quite happy linux 2.4 > still knows about mounting a NFS root filesystem without user-space > help. Just glue the initrd to the kernel. 
This is not rocket science From pakrat@www.linux.org.uk Fri Mar 7 05:38:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 05:39:04 -0800 (PST) Received: from www.linux.org.uk (IDENT:exim@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27DcEq9023589 for ; Fri, 7 Mar 2003 05:38:58 -0800 Received: from pakrat by www.linux.org.uk with local (Exim 3.33 #5) id 18rI3B-00027C-00; Fri, 07 Mar 2003 13:38:13 +0000 Date: Fri, 7 Mar 2003 13:38:13 +0000 From: Chris Dukes To: Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030307133812.A6676@parcelfarce.linux.theplanet.co.uk> References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <20030307000816.P838@flint.arm.linux.org.uk> <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk> <20030307094235.A11807@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030307094235.A11807@flint.arm.linux.org.uk>; from rmk@arm.linux.org.uk on Fri, Mar 07, 2003 at 09:42:35AM +0000 X-archive-position: 1903 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pakrat@www.uk.linux.org Precedence: bulk X-list: netdev Content-Length: 2009 Lines: 49 On Fri, Mar 07, 2003 at 09:42:35AM +0000, Russell King wrote: > On Fri, Mar 07, 2003 at 01:29:05AM +0000, Chris Dukes wrote: > > That's nice. Would you mind explaining to us where that would be a > > benefit? 
Aside from dead header space in elf executables, I'm at
> > a loss as to how a usermode implementation must be significantly
> > larger than kernel code.
>
> If you're suggesting above that "5MB isn't significantly larger than
> the size Linux can do this" then I think I've just proven you wrong.

The 5MB example is AIX.

>
> Let's see - building a ramdisk to mount a root filesystem out of existing
> binaries would require from my existing systems probably something like:
>

I said userspace. I did not say existing binaries.
[Size comparison of the kitchen sink vs kernel code deleted because it's comparing apples and oranges].

>
> Which version is overly bloated?
> Which version is huge?
> Which version is compact?

You are asserting aesthetics instead of benefits. I asked about benefits. Specifically, what is the benefit of compact?

I'm sure you have a very good technical or business benefit to compact, but those of us in the world of workstations and servers have zero clue what it may be.

Another individual has already indicated a very valid technical merit to having it all in one file. I have the same problem myself. AIX and *BSD have a working approach to that problem.

>
> Even the klibc ipconfig version is significantly larger than the in-kernel
> version - and klibc and its binaries are written to be small.

User space solution is not the same as a solution implemented with multiple user space apps.

>
> Note: I *do* agree that ipconfig.c needs to die before 2.6 but I do not
> agree that today is the right day.

Perhaps you could explain why today is not the day. (ie, soon to be shipping product that requires it. desire to see a viable userspace solution working before it is removed).

--
Chris Dukes
I tried being reasonable once--I didn't like it.
From rmk@arm.linux.org.uk Fri Mar 7 06:30:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 06:30:35 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27ETTq9003855 for ; Fri, 7 Mar 2003 06:30:10 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18rIqf-0003qp-00; Fri, 07 Mar 2003 14:29:21 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18rIqe-00061Q-00; Fri, 07 Mar 2003 14:29:20 +0000 Date: Fri, 7 Mar 2003 14:29:20 +0000 From: Russell King To: Chris Dukes Cc: Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030307142920.F17492@flint.arm.linux.org.uk> Mail-Followup-To: Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <20030307000816.P838@flint.arm.linux.org.uk> <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk> <20030307094235.A11807@flint.arm.linux.org.uk> <20030307133812.A6676@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030307133812.A6676@parcelfarce.linux.theplanet.co.uk>; from pakrat@www.uk.linux.org on Fri, Mar 07, 2003 at 01:38:13PM +0000 X-archive-position: 1904 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk 
X-list: netdev Content-Length: 2102 Lines: 50

On Fri, Mar 07, 2003 at 01:38:13PM +0000, Chris Dukes wrote:
> You are asserting aesthetics instead of benefits. I asked about benefits.
> Specifically, what is the benefit of compact?

Think _embedded_. Think "cost of flash chips". Think "not everything has a floppy disk".

> I'm sure you have a very good technical or business benefit to compact,

I'm sorry, believe it or not, but I'm not swayed by "business benefits" here. Although I have my own business in the UK, we as a business are currently involved in hardware design which has nothing to do with the points I'm raising here.

> but those of us in the world of workstations and servers have zero clue
> what it may be.

Indeed and understandable.

> User space solution is not the same as a solution implemented with
> multiple user space apps.

I've been working on klibc to work towards providing such a solution. I know what it involves, and I know that this solution isn't there yet. Also, the fundamentals of klibc have not been accepted by Linus, so we don't even know if this is going to be a solution yet.

> > Note: I *do* agree that ipconfig.c needs to die before 2.6 but I do not
> > agree that today is the right day.
>
> Perhaps you could explain why today is not the day.
> (ie, soon to be shipping product that requires it. desire to see a viable
> userspace solution working before it is removed).

Just about every ARM kernel development downloads kernels via XMODEM and the ability to bring networking up and mount a NFS-root filesystem is by far the easiest way to develop on *any* embedded device with Ethernet.

I suppose you could say I have a _community_ interest here - an interest in ensuring that the ARM community has the resources to be able to continue using Linux.

So, while the big server people run around removing functionality they don't need, they make other parts of the community suffer. Is that really what Open Source is about? Suffering? 8)

--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

From root@chaos.analogic.com Fri Mar 7 08:23:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 08:23:21 -0800 (PST) Received: from chaos.analogic.com (chaos.analogic.com [204.178.40.224]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27GMaq9019749 for ; Fri, 7 Mar 2003 08:23:17 -0800 Received: (from root@localhost) by chaos.analogic.com (8.11.0.Beta3(chaos.analogic.com)/8.12.0.A) id h27GNur15326; Fri, 7 Mar 2003 11:23:56 -0500 Date: Fri, 7 Mar 2003 11:23:56 -0500 (EST) From: "Richard B. Johnson" X-Sender: root@chaos Reply-To: root@chaos.analogic.com To: Russell King cc: Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. In-Reply-To: <20030307142920.F17492@flint.arm.linux.org.uk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1905 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: root@chaos.analogic.com Precedence: bulk X-list: netdev Content-Length: 3701 Lines: 85

On Fri, 7 Mar 2003, Russell King wrote:

> On Fri, Mar 07, 2003 at 01:38:13PM +0000, Chris Dukes wrote:
> > You are asserting aesthetics instead of benefits. I asked about benefits.
> > Specifically, what is the benefit of compact?
>
> Think _embedded_. Think "cost of flash chips". Think "not everything
> has a floppy disk".
>
> > I'm sure you have a very good technical or business benefit to compact,
>
> I'm sorry, believe it or not, but I'm not swayed by "business benefits"
> here. Although I have my own business in the UK, we as a business are
> currently involved in hardware design which has nothing to do with the
> points I'm raising here.
>
> > but those of us in the world of workstations and servers have zero clue
> > what it may be.
>
> Indeed and understandable.
>
> > User space solution is not the same as a solution implemented with
> > multiple user space apps.
>
> I've been working on klibc to work towards providing such a solution.
> I know what it involves, and I know that this solution isn't there yet.
> Also, the fundamentals of klibc have not been accepted by Linus, so we
> don't even know if this is going to be a solution yet.
>
> > > Note: I *do* agree that ipconfig.c needs to die before 2.6 but I do not
> > > agree that today is the right day.
> >
> > Perhaps you could explain why today is not the day.
> > (ie, soon to be shipping product that requires it. desire to see a viable
> > userspace solution working before it is removed).
>
> Just about every ARM kernel development downloads kernels via XMODEM
> and the ability to bring networking up and mount a NFS-root filesystem
> is by far the easiest way to develop on *any* embedded device with
> Ethernet.
>
> I suppose you could say I have a _community_ interest here - an interest
> in ensuring that the ARM community has the resources to be able to continue
> using Linux.
>
> So, while the big server people run around removing functionality they
> don't need, they make other parts of the community suffer. Is that
> really what Open Source is about? Suffering? 8)
>
> --
> Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
> http://www.arm.linux.org.uk/personal/aboutme.html
> -

As the kernel changes there are some things that really need to remain. You need to be able to boot from a "floppy" disk. Yes, nowadays it's probably not a real floppy, but a BIOS module that emulates a floppy. A lot of people don't realize that this is how a CD/ROM is booted! The BIOS configures it to "look" like a floppy for the purpose of booting. A "bootable" CD/ROM has, as its first partition, the image of a floppy disk. Also, many embedded systems boot from a "RAM" disk that emulates a floppy disk for the purpose of booting.
In fact, there is a good argument to make virtually all embedded systems that use the same CPU as the development environment boot this way. You can design, code, and test the whole damn thing while the hardware engineers are still laying out components.

One such RAM disk on our equipment pages in "sectors" through a tiny (0x1000) window which disappears after booting, therefore no address-space is given up to some NVRAM. Linux is unmodified, thinking it was booted from a 1.44 MB floppy.

If the kernel grows to where this can't be done anymore, then embedded systems will not use modern kernels. It's that simple. So, increased functionality really needs to be put into modules so that the basic kernel doesn't continue to increase in size.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

From malware@t-online.de Fri Mar 7 13:34:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 13:34:50 -0800 (PST) Received: from mailout02.sul.t-online.com (mailout02.sul.t-online.com [194.25.134.17]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27LYiq9014265 for ; Fri, 7 Mar 2003 13:34:45 -0800 Received: from fwd09.sul.t-online.de by mailout02.sul.t-online.com with smtp id 18rPUJ-0006oF-02; Fri, 07 Mar 2003 22:34:43 +0100 Received: from fire.malware.de (320008702754-0001@[193.158.189.2]) by fwd09.sul.t-online.com with esmtp id 18rPU8-11QWQKC; Fri, 7 Mar 2003 22:34:32 +0100 Message-Id: <200303072132.WAA02244@fire.malware.de> Date: Fri, 07 Mar 2003 22:33:07 +0100 From: malware@t-online.de (Michael Mueller) X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.20-pre8 i686) X-Accept-Language: en, de MIME-Version: 1.0 To: Alan Cox CC: Russell King , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module.
References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <200303070715.IAA27138@fire.malware.de> <1047041676.20793.12.camel@irongate.swansea.linux.org.uk> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Sender: 320008702754-0001@t-dialin.net X-archive-position: 1906 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: malware@t-online.de Precedence: bulk X-list: netdev Content-Length: 901 Lines: 26

Hi Alan,

you wrote:
> > Sorry, but I must join Russell here. I have at least one machine which has
> > a bootloader able to load exactly one file only. There is currently no
> > way to load an initrd. It would need to implement the whole (BOOTP+)TFTP
> > stuff again, just to get the initrd. So I was quite happy linux 2.4
> > still knows about mounting a NFS root filesystem without user-space
> > help.
>
> Just glue the initrd to the kernel. This is not rocket science

Do you have a sort of glue fixing the ramdisk support on m68k to support physically non-contiguous memory too? Otherwise I have only 1 MiB for the whole initrd.

So hopefully the removal of ipconfig.c, if decided for, does not propagate back into the 2.4 series. It would add a heap of useless work to do, just to get it up again.
Michael -- Linux@TekXpress http://www-users.rwth-aachen.de/Michael.Mueller4/tekxp/tekxp.html From wli@holomorphy.com Fri Mar 7 13:48:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 13:48:22 -0800 (PST) Received: from holomorphy (mail@holomorphy.com [66.224.33.161]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27LmHq9015257 for ; Fri, 7 Mar 2003 13:48:20 -0800 Received: from wli by holomorphy with local (Exim 3.35 #1 (Debian)) id 18rPgz-0006BU-00; Fri, 07 Mar 2003 13:47:49 -0800 Date: Fri, 7 Mar 2003 13:47:49 -0800 From: William Lee Irwin III To: Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030307214749.GA20188@holomorphy.com> Mail-Followup-To: William Lee Irwin III , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <20030307000816.P838@flint.arm.linux.org.uk> <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk> <20030307094235.A11807@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030307094235.A11807@flint.arm.linux.org.uk> User-Agent: Mutt/1.3.28i Organization: The Domain of Holomorphy X-archive-position: 1907 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wli@holomorphy.com Precedence: bulk X-list: netdev Content-Length: 418 Lines: 12 On Fri, Mar 07, 2003 at 09:42:35AM +0000, Russell King wrote: > That's getting on for 2MB vs: > 2620 2012 0 4632 1218 fs/nfs/nfsroot.o > 8016 380 80 8476 211c 
net/ipv4/ipconfig.o > about 13K. There's a cap on the maximum size of things various bootloaders can load via tftp; 2MB is relatively certain to blow it. ISTR the limit being something near 1MB for 2 of my boxen. -- wli From cfriesen@nortelnetworks.com Fri Mar 7 14:01:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 14:01:52 -0800 (PST) Received: from zcars04e.nortelnetworks.com (zcars04e.nortelnetworks.com [47.129.242.56]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h27M1kq9018166 for ; Fri, 7 Mar 2003 14:01:47 -0800 Received: from zcard307.ca.nortel.com (zcard307.ca.nortel.com [47.129.242.67]) by zcars04e.nortelnetworks.com (Switch-2.2.5/Switch-2.2.0) with ESMTP id h27M0G420428; Fri, 7 Mar 2003 17:00:16 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard307.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDFBJSWB; Fri, 7 Mar 2003 17:00:16 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FSL7Z1ND; Fri, 7 Mar 2003 17:00:17 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id 881ED2D957; Fri, 7 Mar 2003 17:00:15 -0500 (EST) Message-ID: <3E69166F.9080604@nortelnetworks.com> Date: Fri, 07 Mar 2003 17:00:15 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: William Lee Irwin III Cc: Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. 
References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <20030307000816.P838@flint.arm.linux.org.uk> <20030307012905.G20725@parcelfarce.linux.theplanet.co.uk> <20030307094235.A11807@flint.arm.linux.org.uk> <20030307214749.GA20188@holomorphy.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1908 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 981 Lines: 28 William Lee Irwin III wrote: > On Fri, Mar 07, 2003 at 09:42:35AM +0000, Russell King wrote: > >>That's getting on for 2MB vs: >> 2620 2012 0 4632 1218 fs/nfs/nfsroot.o >> 8016 380 80 8476 211c net/ipv4/ipconfig.o >>about 13K. >> > > There's a cap on the maximum size of things various bootloaders can > load via tftp; 2MB is relatively certain to blow it. ISTR the limit > being something near 1MB for 2 of my boxen. Since this is totally machine/architecture specific (we're tftp'ing 10MB kernel/ramdisk images to embedded PPC machines here) it might be a good idea to ask around and find what the most restrictive requirements are. Is 1MB the worst-case or does it get even tighter? 
Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com From ebiederm@xmission.com Fri Mar 7 18:04:01 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 07 Mar 2003 18:04:06 -0800 (PST) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2823xq9029628 for ; Fri, 7 Mar 2003 18:04:01 -0800 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id TAA15566; Fri, 7 Mar 2003 19:03:24 -0700 To: Bogdan Costescu Cc: Russell King , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , Subject: Re: Make ipconfig.c work as a loadable module. References: From: ebiederm@xmission.com (Eric W. Biederman) Date: 07 Mar 2003 19:03:24 -0700 In-Reply-To: Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 1909 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev Content-Length: 2070 Lines: 43 Bogdan Costescu writes: > On Fri, 7 Mar 2003, Russell King wrote: > > > Which version is overly bloated? > > Which version is huge? > > Which version is compact? > > ... and the size is not important only because we want to make everything > smaller, but because of how it's commonly used (at least in the clustering > world from which I come): > > the mainboard BIOS or NIC PROC contains PXE/DHCP client; data is > transferred through UDP, with very poor (if any) congestion control. Only because the implementations suck. See etherboot. 
> Congestion control means here both extreme situations: if packets don't
> arrive to the client, it might not ask again, ask only a limited number of
> times or give up after some timeout; if the server has some faster NIC to
> be able to handle more such requests, it might also send too fast for a
> single client which might drop packets. In some cases, if such situation
> occurs, the client just blocks there printing an error message on the
> console, without trying to restart the whole process and the only way to
> make it do something is to press the Reset button or plug in a keyboard...
> When you have tens or hundreds of such nodes, it's not a pleasure !

But this is all before the kernel is loaded. Having booted a 1000 node cluster with TFTP and DHCP, from a single host, without even being in the same town, I think I have some room to talk.

> Booting a bunch of such nodes would become problematic if they need
> to transfer more data (=initrd) to start the kernel and so network booting
> would become less reliable. Please note that I'm not saying "ipconfig has
> to stay" - just that any solution should not dramatically increase the
> size of data transferred before the jump to kernel code.

Right. But I would suggest fixing your NBP (what PXE loads), which must be < 64K anyway, if you have noticeable reliability problems. Not that I even suggest using PXE for production use anyway. But sometimes you are stuck with what you can do.
Eric From bogdan.costescu@iwr.uni-heidelberg.de Sat Mar 8 02:45:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 02:45:48 -0800 (PST) Received: from mail.iwr.uni-heidelberg.de (mail.iwr.uni-heidelberg.de [129.206.104.30]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28Ajiq9028107 for ; Sat, 8 Mar 2003 02:45:45 -0800 Received: from kenzo.iwr.uni-heidelberg.de (kenzo.iwr.uni-heidelberg.de [129.206.120.29]) by mail.iwr.uni-heidelberg.de (8.11.2/8.11.1) with ESMTP id h28AjgX21889; Sat, 8 Mar 2003 11:45:42 +0100 (MET) Received: from localhost (bogdan@localhost) by kenzo.iwr.uni-heidelberg.de (8.11.6/8.11.6) with ESMTP id h28Ajfd12460; Sat, 8 Mar 2003 11:45:41 +0100 Date: Sat, 8 Mar 2003 11:45:40 +0100 (CET) From: Bogdan Costescu To: "Eric W. Biederman" cc: Russell King , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , Subject: Re: Make ipconfig.c work as a loadable module. In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1910 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bogdan.costescu@iwr.uni-heidelberg.de Precedence: bulk X-list: netdev Content-Length: 1192 Lines: 33 On 7 Mar 2003, Eric W. Biederman wrote: > Only because the implementations suck. See etherboot. Agreed, but as you rightly say at the end of your message... > But sometimes you are stuck with what you can do. ... and you can't go use etherboot or whatever, you have to deal with it. You can deal with it today because ipconfig is small, you might not be able to deal with it tomorrow if you'll have to transfer twice as much because of a big initrd. > But this is all before the kernel is loaded. But that's exactly my point. The ipconfig functionality is needed and what I ask for is that whatever means (if any) are chosen to replace it, they should keep the low size. > Having booted a 1000 node cluster with TFTP and DHCP. 
I do not doubt this, but I'm afraid that you (or we) might not be able to do it again tomorrow. And probably this is an ideal case where you have used the better solution as client (etherboot)... -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868 E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De From dirch221@yahoo.co.in Sat Mar 8 02:50:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 02:50:42 -0800 (PST) Received: from web8205.mail.in.yahoo.com (web8205.mail.in.yahoo.com [203.199.70.126]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28Anwq9028527 for ; Sat, 8 Mar 2003 02:50:39 -0800 Message-ID: <20030308104950.66215.qmail@web8205.mail.in.yahoo.com> Received: from [202.54.65.65] by web8205.mail.in.yahoo.com via HTTP; Sat, 08 Mar 2003 10:49:50 GMT Date: Sat, 8 Mar 2003 10:49:50 +0000 (GMT) From: =?iso-8859-1?q?barkkarn=20aravinda?= Subject: protocol development To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="0-1071572380-1047120590=:63048" Content-Transfer-Encoding: 8bit X-archive-position: 1911 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dirch221@yahoo.co.in Precedence: bulk X-list: netdev Content-Length: 1086 Lines: 22 --0-1071572380-1047120590=:63048 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit hi iam arvind from india.i want to make some changes to tcp protocol and implement in my computer.can i use libpcap to write my own protocol.give me some suggestion about how to work on this project.what are the books that can help me in this project. bye Catch all the cricket action. Download Yahoo! Score tracker --0-1071572380-1047120590=:63048 Content-Type: text/html; charset=iso-8859-1 Content-Transfer-Encoding: 8bit


Catch all the cricket action. Download Yahoo! Score tracker
--0-1071572380-1047120590=:63048--

From ebiederm@xmission.com Sat Mar 8 08:08:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 08:08:24 -0800 (PST) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28G7eq9008014 for ; Sat, 8 Mar 2003 08:08:21 -0800 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id JAA17848; Sat, 8 Mar 2003 09:07:11 -0700 To: Bogdan Costescu Cc: Russell King , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , Subject: Re: Make ipconfig.c work as a loadable module. References: From: ebiederm@xmission.com (Eric W. Biederman) Date: 08 Mar 2003 09:07:11 -0700 In-Reply-To: Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 1912 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev Content-Length: 2280 Lines: 62

Bogdan Costescu writes:

> On 7 Mar 2003, Eric W. Biederman wrote:
>
> > Only because the implementations suck. See etherboot.
>
> Agreed, but as you rightly say at the end of your message...
>
> > But sometimes you are stuck with what you can do.
>
> ... and you can't go use etherboot or whatever, you have to deal with it.

At the very least I can use etherboot as a NBP in PXE terms. So I have a reasonable client after the first tftp transaction.

> You can deal with it today because ipconfig is small, you might not be
> able to deal with it tomorrow if you'll have to transfer twice as much
> because of a big initrd.

I routinely support an initrd with:
glibc.
/bin/bash
dhclient
mke2fs
mkreiserfs
parted
sfdisk
mount
pivot_root
etc. (All binaries were stripped though).
And I usually have to pass a ramdisk_size=XXX option to the kernel or my decompressed initial ramdisk is too large.

I use it for setting up a local filesystem on a cluster node. And I was able to set up an entire 1000 node cluster in about 15-20 minutes. (Multicast cuts down on the bandwidth requirements which is very nice).

With a good bootloader it does not matter much how big your initrd is. I totally agree that small is good and important. At the same time ipconfig.c is wrong. It is great during development and on systems with a single NIC. But the hard coded policies can be bad for production systems. Not that hard coded policies are bad in general, just the kernel is the wrong place to put them.

> > But this is all before the kernel is loaded.
>
> But that's exactly my point. The ipconfig functionality is needed and what
> I ask for is that whatever means (if any) are chosen to replace it, they
> should keep the low size.

Similar functionality is definitely needed.

> > Having booted a 1000 node cluster with TFTP and DHCP.
>
> I do not doubt this, but I'm afraid that you (or we) might not be able to
> do it again tomorrow. And probably this is an ideal case where you have
> used the better solution as client (etherboot)...

True. But when things are important and there is GPL'd firmware available that actually works properly, it is worth putting it on the requirements list of things to do.
Eric From rmk@arm.linux.org.uk Sat Mar 8 08:19:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 08:19:51 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28GJkq9008418 for ; Sat, 8 Mar 2003 08:19:48 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18rh2w-0000hK-00; Sat, 08 Mar 2003 16:19:38 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18rh2v-0002Ad-00; Sat, 08 Mar 2003 16:19:37 +0000 Date: Sat, 8 Mar 2003 16:19:36 +0000 From: Russell King To: "Eric W. Biederman" Cc: Bogdan Costescu , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030308161936.C1896@flint.arm.linux.org.uk> Mail-Followup-To: "Eric W. Biederman" , Bogdan Costescu , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: ; from ebiederm@xmission.com on Sat, Mar 08, 2003 at 09:07:11AM -0700 X-archive-position: 1913 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 1979 Lines: 45 On Sat, Mar 08, 2003 at 09:07:11AM -0700, Eric W. Biederman wrote: > With a good bootloader it does not much how big your initrd is. I > totally agree that small is good and important. At the same time > ipconfig.c is wrong. It is great during development and on systems > with a single NIC. But the hard coded policies can be bad for > production systems. 
Not that hard coded policies are bad in general > just the kernel is the wrong place to put them. With multi-NIC systems, it is perfectly possible to use ipconfig.c with one specific interface. /* * Decode any IP configuration options in the "ip=" or "nfsaddrs=" kernel * command line parameter. It consists of option fields separated by colons in * the following order: * * :::::: * * Any of the fields can be empty which means to use a default value: * - address given by BOOTP or RARP * - address of host returning BOOTP or RARP packet * - none, or the address returned by BOOTP * - automatically determined from , or the * one returned by BOOTP * - in ASCII notation, or the name returned * by BOOTP * - use all available devices * : * off|none - don't do autoconfig at all (DEFAULT) * on|any - use any configured protocol * dhcp|bootp|rarp - use only the specified protocol * both - use both BOOTP and RARP (not DHCP) */ ip=:::::eth0:dhcp (I haven't actually tried this though.) However, how do you configure your ramdisk via the boot loader to use a specific NIC / mount a specific filesystem, etc? -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html From ebiederm@xmission.com Sat Mar 8 08:48:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 08:48:54 -0800 (PST) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28Gmoq9008957 for ; Sat, 8 Mar 2003 08:48:51 -0800 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id JAA17979; Sat, 8 Mar 2003 09:48:16 -0700 To: Russell King Cc: Bogdan Costescu , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. References: <20030308161936.C1896@flint.arm.linux.org.uk> From: ebiederm@xmission.com (Eric W. 
Biederman) Date: 08 Mar 2003 09:48:16 -0700 In-Reply-To: <20030308161936.C1896@flint.arm.linux.org.uk> Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 1914 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev Content-Length: 1449 Lines: 36 Russell King writes: > On Sat, Mar 08, 2003 at 09:07:11AM -0700, Eric W. Biederman wrote: > > With a good bootloader it does not much how big your initrd is. I > > totally agree that small is good and important. At the same time > > ipconfig.c is wrong. It is great during development and on systems > > with a single NIC. But the hard coded policies can be bad for > > production systems. Not that hard coded policies are bad in general > > just the kernel is the wrong place to put them. > > With multi-NIC systems, it is perfectly possible to use ipconfig.c with > one specific interface. Sorry. I expressed that wrong. It is not multi-NIC that ipconfig.c gets wrong. It is multiple DHCP servers. You just get multiple dhcp servers when you have multiple NICs. The policies in ipconfig.c are quite good, they just are not universally applicable. But as ipconfig.c is in the kernel it tends to get used where it is inappropriate. > ip=:::::eth0:dhcp > > (I haven't actually tried this though.) I had forgotten about that one, and I believe it helps in some cases. > However, how do you configure your ramdisk via the boot loader to use > a specific NIC / mount a specific filesystem, etc? I can change the contents of my ramdisk as easily as I can change the kernel command line. For the complex setups just placing a configuration file in the ramdisk is what seems to work the best in practice. 
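To make the field layout of an option such as ip=:::::eth0:dhcp concrete, here is a minimal user-space sketch in C. It is not the kernel's parser from ipconfig.c; it only splits the value on colons while keeping empty fields, so that the sixth field comes out as the device and the seventh as the autoconfiguration protocol. The names split_ip_option and IP_FIELDS are made up for this illustration, and shorthand forms such as ip=dhcp are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IP_FIELDS 7   /* hypothetical name: seven colon-separated fields */

/*
 * Split the value of an "ip=" option in place on ':' and store a pointer
 * to each field in fields[]. Empty fields are kept as empty strings, which
 * per the quoted comment means "use a default value". Fields that are not
 * present at all are also left as "". Returns the number of fields found.
 */
static int split_ip_option(char *value, char *fields[IP_FIELDS])
{
    static char empty[] = "";
    int n = 0;
    char *p = value;

    assert(value != NULL && fields != NULL);

    for (int i = 0; i < IP_FIELDS; i++)
        fields[i] = empty;

    while (n < IP_FIELDS) {
        char *colon = strchr(p, ':');

        fields[n++] = p;
        if (colon == NULL)
            break;
        *colon = '\0';   /* terminate this field */
        p = colon + 1;   /* the next field starts after the colon */
    }
    return n;
}
```

Called on a writable copy of ":::::eth0:dhcp", this leaves fields[0] through fields[4] empty, fields[5] pointing at "eth0" and fields[6] at "dhcp". Anything after a seventh colon is simply dropped by this sketch.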
Eric From rmk@arm.linux.org.uk Sat Mar 8 09:05:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 09:05:48 -0800 (PST) Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [212.18.232.186]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28H5gq9009426 for ; Sat, 8 Mar 2003 09:05:44 -0800 Received: from flint.arm.linux.org.uk ([3ffe:8260:2002:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with asmtp (TLSv1:DES-CBC3-SHA:168) (Exim 4.12) id 18rhlO-0000rO-00; Sat, 08 Mar 2003 17:05:34 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.12) id 18rhlN-0002bB-00; Sat, 08 Mar 2003 17:05:33 +0000 Date: Sat, 8 Mar 2003 17:05:32 +0000 From: Russell King To: "Eric W. Biederman" Cc: Bogdan Costescu , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. Message-ID: <20030308170532.D1896@flint.arm.linux.org.uk> Mail-Followup-To: "Eric W. Biederman" , Bogdan Costescu , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <20030308161936.C1896@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: ; from ebiederm@xmission.com on Sat, Mar 08, 2003 at 09:48:16AM -0700 X-archive-position: 1915 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rmk@arm.linux.org.uk Precedence: bulk X-list: netdev Content-Length: 1018 Lines: 21 On Sat, Mar 08, 2003 at 09:48:16AM -0700, Eric W. Biederman wrote: > I can change the contents of my ramdisk as easily as I can change > the kernel command line. For the complex setups just placing > a configuration file in the ramdisk is what seems to work the best > in practice. 
You'll forgive me if I don't think that "change the contents of ramdisk" is as easy as changing the kernel command line. Last time I checked, to change the contents of a ramdisk image, you needed to ungzip it, mount it, make some changes, unmount it, re-gzip it, and re-install the thing. Or, in the case of initramfs, you need to rebuild the kernel image. Compare this to changing the kernel command line from "root=/dev/hda1" to "root=/dev/nfs ip=dhcp" in the boot loader by hitting a few keys on the keyboard before the kernel loads, and I think you'll start to get my point here. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html From ebiederm@xmission.com Sat Mar 8 12:51:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 12:51:43 -0800 (PST) Received: from frodo.biederman.org (ebiederm.dsl.xmission.com [166.70.28.69]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h28Koxq9011235 for ; Sat, 8 Mar 2003 12:51:40 -0800 Received: (from eric@localhost) by frodo.biederman.org (8.9.3/8.9.3) id LAA18149; Sat, 8 Mar 2003 11:01:06 -0700 To: Russell King Cc: Bogdan Costescu , Chris Dukes , Alan Cox , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module. References: <20030308161936.C1896@flint.arm.linux.org.uk> <20030308170532.D1896@flint.arm.linux.org.uk> From: ebiederm@xmission.com (Eric W. Biederman) Date: 08 Mar 2003 11:01:05 -0700 In-Reply-To: <20030308170532.D1896@flint.arm.linux.org.uk> Message-ID: User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 1916 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ebiederm@xmission.com Precedence: bulk X-list: netdev Content-Length: 3108 Lines: 73 Russell King writes: > On Sat, Mar 08, 2003 at 09:48:16AM -0700, Eric W. 
Biederman wrote:
> > I can change the contents of my ramdisk as easily as I can change
> > the kernel command line. For the complex setups just placing
> > a configuration file in the ramdisk is what seems to work the best
> > in practice.
>
> You'll forgive me if I don't think that "change the contents of ramdisk"
> is as easy as changing the kernel command line.
>
> Last time I checked, to change the contents of a ramdisk image, you needed
> to ungzip it, mount it, make some changes, unmount it, re-gzip it, and
> re-install the thing. Or, in the case of initramfs, you need to rebuild
> the kernel image. Compare this to changing the kernel command line from
> "root=/dev/hda1" to "root=/dev/nfs ip=dhcp" in the boot loader by hitting
> a few keys on the keyboard before the kernel loads, and I think you'll
> start to get my point here.

Currently, on the systems I am talking about, I have a directory structured like:

dir/config
dir/bzImage
dir/ramdisk
dir/ramdisk/sbin/init
dir/ramdisk/etc/
.....

So I edit dir/ramdisk/etc/somefile.conf and run a script that rebuilds everything. Or I edit dir/config which has my command line in it and run the script again. Getting to this point took a bit of effort but that is where I am at now.

With initramfs as designed it becomes easier, because it is easier to build a cpio archive. But mkcramfs has similar properties for building filesystems.

The whole building the initramfs thing into the kernel is something that probably needs to be worked on so the initramfs can be attached to the kernel separately. When the bootable kernel image is ELF that is easy. With something like bzImage on x86 it can be a pain, as there isn't any room to extend things.

And all I asserted is that for ``me'' it is equally simple to change the ramdisk contents as to change those of a file.

For something like /bin/kinit that contains the default kernel policies on how to mount root it should certainly be command line driven.
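As a rough illustration of what "command line driven" could look like for a user-space helper, here is a minimal sketch in C that looks up a key=value option in a kernel command line string. The function name cmdline_get is made up for this example, it is not taken from kinit or any existing tool, and it assumes the caller has already read the string, typically from /proc/cmdline. Quoted values containing spaces are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up "key=value" in a whitespace-separated kernel command line and
 * copy the value into out (truncated to fit). Returns 1 if the key was
 * found, 0 otherwise. cmdline_get is a hypothetical helper name.
 */
static int cmdline_get(const char *cmdline, const char *key,
                       char *out, size_t outlen)
{
    size_t keylen;
    const char *p = cmdline;

    assert(cmdline != NULL && key != NULL && out != NULL && outlen > 0);
    keylen = strlen(key);

    while (*p != '\0') {
        size_t toklen;

        p += strspn(p, " \t\n");        /* skip separators */
        toklen = strcspn(p, " \t\n");   /* length of this token */

        /* Match "key=" only at the start of a token. */
        if (toklen > keylen && strncmp(p, key, keylen) == 0 &&
            p[keylen] == '=') {
            size_t vallen = toklen - keylen - 1;

            if (vallen >= outlen)
                vallen = outlen - 1;    /* truncate to fit */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += toklen;
    }
    return 0;
}
```

With the command line "root=/dev/nfs ip=dhcp", looking up "ip" yields "dhcp" and looking up "root" yields "/dev/nfs"; a key that is absent leaves out untouched and returns 0.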
For complicated setups where I am partitioning the hard drives, making filesystems, and installing over the network, a configuration file has proven to be easier, and that is what I do. The fundamental issue is that after a certain point the command line just does not have room for all of the parameters needed.

Possibly I answered the wrong question?

As for hitting a few keys on the keyboard in the bootloader before the kernel loads, well.... That is good on one machine, it gets to be a pain on 4. And at 1000 I have much better things to do with my time. Which just shows my bias from working with clusters. On a cluster the only time you want to treat a machine as an individual is when you are replacing bad hardware.

I have played with parsing command line options. And messing with /proc/cmdline or being /sbin/init and just getting those options from the kernel is not difficult. For prototyping it may be a good idea to read /proc/cmdline so the kernel can eat the options before kinit does.

Eric

From acme@conectiva.com.br Sat Mar 8 20:45:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 08 Mar 2003 20:45:44 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h294jbq9015616 for ; Sat, 8 Mar 2003 20:45:39 -0800 Received: from [200.181.170.60] (helo=brinquendo.conectiva.com.br) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 18rsiB-0007pl-00; Sun, 09 Mar 2003 01:46:59 -0300 Received: by brinquendo.conectiva.com.br (Postfix, from userid 500) id 4F9B01966C; Sun, 9 Mar 2003 04:46:34 +0000 (UTC) Date: Sun, 9 Mar 2003 01:46:33 -0300 From: Arnaldo Carvalho de Melo To: Alan Cox Cc: Michael Mueller , Russell King , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: Make ipconfig.c work as a loadable module.
Message-ID: <20030309044633.GC9359@conectiva.com.br> Mail-Followup-To: Arnaldo Carvalho de Melo , Alan Cox , Michael Mueller , Russell King , Jeff Garzik , Robin Holt , Linux Kernel Mailing List , netdev@oss.sgi.com References: <1046990052.18158.121.camel@irongate.swansea.linux.org.uk> <20030306221136.GB26732@gtf.org> <20030306222546.K838@flint.arm.linux.org.uk> <1046996037.18158.142.camel@irongate.swansea.linux.org.uk> <20030306231905.M838@flint.arm.linux.org.uk> <1046996987.17718.144.camel@irongate.swansea.linux.org.uk> <200303070715.IAA27138@fire.malware.de> <1047041676.20793.12.camel@irongate.swansea.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1047041676.20793.12.camel@irongate.swansea.linux.org.uk> User-Agent: Mutt/1.4i X-Url: http://advogato.org/person/acme X-archive-position: 1917 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Content-Length: 887 Lines: 21 Em Fri, Mar 07, 2003 at 12:54:36PM +0000, Alan Cox escreveu: > On Fri, 2003-03-07 at 07:15, Michael Mueller wrote: > > Hi Alan, > > Sorry, but I must join Russel here. I have atleast one machine which has > > a bootloader able to load exactly one file only. There is currently no > > way to load an initrd. It would need to implement the whole (BOOTP+)TFTP > > stuff again, just to get the initrd. So I was quite happy linux 2.4 > > still knows about mounting a NFS root filesystem without user-space > > help. > > Just glue the initrd to the kernel. This is not rocket science arch/sparc/boot/piggyback.c Simple utility to make a single-image install kernel with initial ramdisk for Sparc tftpbooting without need to set up nfs. Copyright (C) 1996 Jakub Jelinek (jj@sunsite.mff.cuni.cz) Pete Zaitcev endian fixes for cross-compiles, 2000. 
- Arnaldo From seong@etri.re.kr Sun Mar 9 15:54:35 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 09 Mar 2003 15:54:39 -0800 (PST) Received: from cms1.etri.re.kr (cms1.etri.re.kr [129.254.16.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h29NsXq9005371 for ; Sun, 9 Mar 2003 15:54:34 -0800 Received: from SEONG ([129.254.172.40]) by cms1.etri.re.kr with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GRNP2T9Q; Mon, 10 Mar 2003 08:54:13 +0900 Message-ID: <001201c2e697$6ace7280$28acfe81@seong> From: "Seong Moon" To: Subject: multicast routing cache monitoring? Date: Mon, 10 Mar 2003 08:55:49 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4920.2300 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4920.2300 X-archive-position: 1918 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: seong@etri.re.kr Precedence: bulk X-list: netdev Content-Length: 210 Lines: 11 Hi there! I want to get and monitor multicast routing cache information from kernel through rtnetlink. Is it possible ? I'm using linux2.4.18. If it is possible, What can I do for this ? thanks in advance. 
From ulrik.debie@newtec.be Mon Mar 10 07:16:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 10 Mar 2003 07:16:59 -0800 (PST) Received: from mailhost.newtec.be ([62.58.98.250]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2AFGnq9026141 for ; Mon, 10 Mar 2003 07:16:51 -0800 Received: from Newtec_gw-MTA by mailhost.newtec.be with Novell_GroupWise; Mon, 10 Mar 2003 16:16:46 +0100 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.0.2 Date: Mon, 10 Mar 2003 16:16:13 +0100 From: "Ulrik De Bie" To: , , Subject: Fwd: tcp seq nr wrapping bug + patch Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=_D887874E.62036C27" X-archive-position: 1919 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ulrik.debie@newtec.be Precedence: bulk X-list: netdev Content-Length: 1554 Lines: 63

--=_D887874E.62036C27 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

Hello,

I resend this patch which fixes a stupid mistake in the tcp sequence number in the 2.2 kernel.

Kind regards,
Ulrik De Bie

--=_D887874E.62036C27 Content-Type: message/rfc822

Date: Wed, 11 Sep 2002 17:36:27 +0200 From: "Ulrik De Bie" To: , Subject: tcp seq nr wrapping bug + patch Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

When the sequence number in a tcp session is about to wrap for packets leaving the system, a problem arises:

When the system call writev is called, with a count of 5 for instance, and the second iov entry makes the sequence number wrap, then the other 3 will be sent in separate packets, because the comparison will be wrong. before() fixes this problem.

Sorry that I'm sending from a windows machine at the moment, I don't have a linux mail machine available at the very moment.
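To illustrate why a plain less-than comparison goes wrong when a 32-bit sequence number wraps, here is a small standalone sketch in C. The names seq_before and naive_before are made up for this example; as far as I know, the kernel's before() helper uses the same signed-difference idea, but this is not the kernel code itself.

```c
#include <stdint.h>

/*
 * Wrap-safe "a comes before b" for 32-bit sequence numbers: the unsigned
 * difference is reinterpreted as signed, so the result stays correct as
 * long as the two values are less than 2^31 apart.
 */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* The plain comparison, which gives the wrong answer across a wrap. */
static int naive_before(uint32_t a, uint32_t b)
{
    return a < b;
}
```

Take a = 0xFFFFFFF0 (just before the wrap) and b = 0x00000010 (just after it). Logically a comes before b, and seq_before(a, b) is true because a - b is 0xFFFFFFE0, which is -32 as a signed value. naive_before(a, b) is false, since 0xFFFFFFF0 is numerically larger than 0x10.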
Kind regards, Ulrik De Bie udb@newtec.be --- linux-2.2.21/net/ipv4/tcp.c Wed Sep 11 11:03:10 2002 +++ linux/net/ipv4/tcp.c Wed Sep 11 17:27:53 2002 @@ -823,7 +823,7 @@ */ if (skb_tailroom(skb) > 0 && (mss_now - copy) > 0 && - tp->snd_nxt < TCP_SKB_CB(skb)->end_seq) { + before(tp->snd_nxt , TCP_SKB_CB(skb)->end_seq)) { int last_byte_was_odd = (copy % 4); /* --=_D887874E.62036C27-- From johnpol@2ka.mipt.ru Mon Mar 10 11:23:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 10 Mar 2003 11:23:30 -0800 (PST) Received: from ffke-campus-gw.mipt.ru (ffke-campus-gw.mipt.ru [194.85.82.65]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2AJMgq9007672 for ; Mon, 10 Mar 2003 11:23:25 -0800 Received: from zanzibar.2ka.mipt.ru (zanzibar.2ka.mipt.ru [194.85.82.77]) by ffke-campus-gw.mipt.ru (8.12.8/8.12.8) with SMTP id h2AJMYBC030715 for ; Mon, 10 Mar 2003 22:22:34 +0300 Date: Mon, 10 Mar 2003 22:22:05 +0300 From: Evgeniy Polyakov To: netdev@oss.sgi.com Subject: netconsole for kernel 2.5.64 Message-Id: <20030310222205.0664b476.johnpol@2ka.mipt.ru> Reply-To: johnpol@2ka.mipt.ru Organization: MIPT X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="Multipart_Mon__10_Mar_2003_22:22:05_+0300_082e02e8" X-archive-position: 1920 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev Content-Length: 19468 Lines: 273 This is a multi-part message in MIME format. --Multipart_Mon__10_Mar_2003_22:22:05_+0300_082e02e8 Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable X-MIME-Autoconverted: from 8bit to quoted-printable by ffke-campus-gw.mipt.ru id h2AJMYBC030715 Hello, developers. If anyone is still interested in Ingo Molnar's netconsole patch against the latest 2.5 tree, here it is. It is now statically linked into the kernel, so it can capture at least part of dmesg.
It sends UDP datagrams only to the broadcast address (255.255.255.255/ff:ff:ff:ff:ff:ff), and only from the address 10.0.0.2. Both ports are also statically set to 6666. It uses only the first net_device that is not LOOPBACK or dummy and whose flags can be changed. In theory anyone can still do insmod with params. In practice the module part differs from the original almost only in the replacement of __cli() and others by cli()... netconsole_client is the latest from http://redhat.com/~mingo/netconsole-patches/ Evgeniy Polyakov ( s0mbre ) --Multipart_Mon__10_Mar_2003_22:22:05_+0300_082e02e8 Content-Type: application/octet-stream; name="netconsole-2.5.64.diff" Content-Disposition: attachment; filename="netconsole-2.5.64.diff" Content-Transfer-Encoding: base64 ZGlmZiAtTnJ1IC4uLzEvbGludXgtMi41LjY0L2RyaXZlcnMvbmV0LzNjNTl4LmMgLi9kcml2ZXJz L25ldC8zYzU5eC5jCi0tLSAuLi8xL2xpbnV4LTIuNS42NC9kcml2ZXJzL25ldC8zYzU5eC5jCVdl ZCBNYXIgIDUgMDY6Mjg6NTMgMjAwMworKysgLi9kcml2ZXJzL25ldC8zYzU5eC5jCU1vbiBNYXIg MTAgMjE6NTk6MzMgMjAwMwpAQCAtODg1LDYgKzg4NSw5IEBACiBzdGF0aWMgaW50IHZvcnRleF9p b2N0bChzdHJ1Y3QgbmV0X2RldmljZSAqZGV2LCBzdHJ1Y3QgaWZyZXEgKnJxLCBpbnQgY21kKTsK IHN0YXRpYyB2b2lkIHZvcnRleF90eF90aW1lb3V0KHN0cnVjdCBuZXRfZGV2aWNlICpkZXYpOwog c3RhdGljIHZvaWQgYWNwaV9zZXRfV09MKHN0cnVjdCBuZXRfZGV2aWNlICpkZXYpOworI2lmZGVm IEhBVkVfUE9MTF9DT05UUk9MTEVSCitzdGF0aWMgdm9pZCBfX3BvbGxfY29udHJvbGxlcihzdHJ1 Y3QgbmV0X2RldmljZSAqZGV2KTsgCisjZW5kaWYKIAwKIC8qIFRoaXMgZHJpdmVyIHVzZXMgJ29w dGlvbnMnIHRvIHBhc3MgdGhlIG1lZGlhIHR5cGUsIGZ1bGwtZHVwbGV4IGZsYWcsIGV0Yy4gKi8K IC8qIE9wdGlvbiBjb3VudCBsaW1pdCBvbmx5IC0tIHVubGltaXRlZCBpbnRlcmZhY2VzIGFyZSBz dXBwb3J0ZWQuICovCkBAIC05MzYsNiArOTM5LDIwIEBACiAKICNlbmRpZiAvKiBDT05GSUdfUE0g Ki8KIAorI2lmZGVmIEhBVkVfUE9MTF9DT05UUk9MTEVSCitzdGF0aWMgdm9pZCBfX3BvbGxfY29u dHJvbGxlcihzdHJ1Y3QgbmV0X2RldmljZSAqZGV2KQoreworCXN0cnVjdCB2b3J0ZXhfcHJpdmF0 ZSAqdnAgPSBkZXYtPnByaXY7CisJCisJZGlzYWJsZV9pcnEoZGV2LT5pcnEpOworCWlmICh2cC0+ ZnVsbF9idXNfbWFzdGVyX3J4KQorCQlib29tZXJhbmdfaW50ZXJydXB0KGRldi0+aXJxLCBkZXYs
IE5VTEwpOworCWVsc2UKKwkJdm9ydGV4X2ludGVycnVwdChkZXYtPmlycSwgZGV2LCBOVUxMKTsK KwllbmFibGVfaXJxKGRldi0+aXJxKTsKK30KKyNlbmRpZgorCiAjaWZkZWYgQ09ORklHX0VJU0EK IHN0YXRpYyBzdHJ1Y3QgZWlzYV9kZXZpY2VfaWQgdm9ydGV4X2Vpc2FfaWRzW10gPSB7CiAJeyAi VENNNTkyMCIgfSwKQEAgLTE0MzgsNiArMTQ1NSw5IEBACiAJCQkJKGRldi0+ZmVhdHVyZXMgJiBO RVRJRl9GX0lQX0NTVU0pID8gImVuIjoiZGlzIik7CiAJfQogCisjaWZkZWYgSEFWRV9QT0xMX0NP TlRST0xMRVIKKwlkZXYtPnBvbGxfY29udHJvbGxlciA9IF9fcG9sbF9jb250cm9sbGVyOworI2Vu ZGlmCiAJZGV2LT5zdG9wID0gdm9ydGV4X2Nsb3NlOwogCWRldi0+Z2V0X3N0YXRzID0gdm9ydGV4 X2dldF9zdGF0czsKIAlkZXYtPmRvX2lvY3RsID0gdm9ydGV4X2lvY3RsOwpCaW5hcnkgZmlsZXMg Li4vMS9saW51eC0yLjUuNjQvZHJpdmVycy9uZXQvM2M1OXgubyBhbmQgLi9kcml2ZXJzL25ldC8z YzU5eC5vIGRpZmZlcgpkaWZmIC1OcnUgLi4vMS9saW51eC0yLjUuNjQvZHJpdmVycy9uZXQvS2Nv bmZpZyAuL2RyaXZlcnMvbmV0L0tjb25maWcKLS0tIC4uLzEvbGludXgtMi41LjY0L2RyaXZlcnMv bmV0L0tjb25maWcJV2VkIE1hciAgNSAwNjoyOTozNCAyMDAzCisrKyAuL2RyaXZlcnMvbmV0L0tj b25maWcJTW9uIE1hciAxMCAyMTo1NzozMCAyMDAzCkBAIC0zOSw2ICszOSwyOCBAQAogCXNvdXJj ZSAiZHJpdmVycy9uZXQvYXJjbmV0L0tjb25maWciCiBlbmRpZgogCitjb25maWcgTkVUQ09OU09M RQorCXRyaXN0YXRlICJOZXR3b3JrIGNvbnNvbGUgc3VwcG9ydCIKKwlkZXBlbmRzIG9uIE5FVERF VklDRVMKKwktLS1oZWxwLS0tCisJTmV0d29yayBjb25zb2xlIGlzIGEgZGVidWdnaW5nIHRvb2wg dGhhdCBpbXBsZW1lbnRzIAorCWtlcm5lbC1sZXZlbCBuZXR3b3JrIGxvZ2dpbmcgdmlhIFVEUCBw YWNrZXRzLgorCisJdGhlIHNwZWNpYWwgdGhpbmcgYWJvdXQgdGhpcyBhcHByb2FjaCBpcyB0aGUg YWJpbGl0eSB0byBzZW5kICdlbWVyZ2VuY3knCisJbmV0d29yayBwYWNrZXRzIGV2ZW4gZnJvbSBJ UlEgaGFuZGxlcnMuIFRoaXMgZW5hYmxlcyB0aGUgbmV0Y29uc29sZSB0bworCXNlbmQgZW5vdWdo IGluZm8gZXZlbiBpZiB3ZSBjcmFzaCBpbiBpbml0IG9yIGluIGFuIGludGVycnVwdCBoYW5kbGVy LgorCisJYW5vdGhlciBwcm9wZXJ0eSBvZiBuZXRjb25zb2xlIGlzIHRoYXQgaXQncyBhYmxlIHRv IHNoYXJlIHRoZSBuZXR3b3JraW5nCisJZGV2aWNlIHdpdGggb3RoZXIga2VybmVsIHN1YnN5c3Rl bXMsIGxpa2UgdGhlIFRDUC9JUCBzdGFjay4gU28gdGhlCisJbmV0d29ya2luZyBkZXZpY2UgaXMg bm90IGRlZGljYXRlZCBmb3IgbmV0Y29uc29sZSB1c2UsIGl0J3MgdHJhbnNwYXJlbnRseQorCXNo 
YXJlZC4KKworCW5ldGNvbnNvbGUgaXMgYWxzbyBkZXNpZ25lZCB0byBiZSByb2J1c3QsIGl0IGdv ZXMgc3RyYWlnaHQgdG8gdGhlIG5ldHdvcmsKKwlkcml2ZXIsIHNvIGl0IGRvZXMgbm90IGRlcGVu ZCBvbiB0aGUgbmV0d29ya2luZyBzdGFjayB0byBsb2cgbWVzc2FnZXMuIAorCQorCWh0dHA6Ly9t YXJjLnRoZWFpbXNncm91cC5jb20vP2w9bGludXgta2VybmVsJm09MTAwMTUzNTE1MTI2OTEwJnc9 MgorCWh0dHA6Ly9yZWRoYXQuY29tL35taW5nby9uZXRjb25zb2xlLXBhdGNoZXMvCisKIGNvbmZp ZyBEVU1NWQogCXRyaXN0YXRlICJEdW1teSBuZXQgZHJpdmVyIHN1cHBvcnQiCiAJZGVwZW5kcyBv biBORVRERVZJQ0VTCmRpZmYgLU5ydSAuLi8xL2xpbnV4LTIuNS42NC9kcml2ZXJzL25ldC9NYWtl ZmlsZSAuL2RyaXZlcnMvbmV0L01ha2VmaWxlCi0tLSAuLi8xL2xpbnV4LTIuNS42NC9kcml2ZXJz L25ldC9NYWtlZmlsZQlXZWQgTWFyICA1IDA2OjI5OjA0IDIwMDMKKysrIC4vZHJpdmVycy9uZXQv TWFrZWZpbGUJTW9uIE1hciAxMCAyMTo1MDoxMyAyMDAzCkBAIC0xODksNSArMTg5LDcgQEAKIG9i ai0kKENPTkZJR19IQU1SQURJTykgKz0gaGFtcmFkaW8vCiBvYmotJChDT05GSUdfSVJEQSkgKz0g aXJkYS8KIAorb2JqLSQoQ09ORklHX05FVENPTlNPTEUpICs9IG5ldGNvbnNvbGUubworCiAKIGlu Y2x1ZGUgJChUT1BESVIpL2RyaXZlcnMvdXNiL25ldC9NYWtlZmlsZS5taWkKZGlmZiAtTnJ1IC4u LzEvbGludXgtMi41LjY0L2RyaXZlcnMvbmV0L25ldGNvbnNvbGUuYyAuL2RyaXZlcnMvbmV0L25l dGNvbnNvbGUuYwotLS0gLi4vMS9saW51eC0yLjUuNjQvZHJpdmVycy9uZXQvbmV0Y29uc29sZS5j CVRodSBKYW4gIDEgMDM6MDA6MDAgMTk3MAorKysgLi9kcml2ZXJzL25ldC9uZXRjb25zb2xlLmMJ TW9uIE1hciAxMCAyMjowMDo1MSAyMDAzCkBAIC0wLDAgKzEsMzcwIEBACisvKgorICogIGxpbnV4 L2RyaXZlcnMvbmV0L25ldGNvbnNvbGUuYworICoKKyAqICBDb3B5cmlnaHQgKEMpIDIwMDEgIElu Z28gTW9sbmFyIDxtaW5nb0ByZWRoYXQuY29tPgorICoKKyAqICBUaGlzIGZpbGUgY29udGFpbnMg dGhlIGltcGxlbWVudGF0aW9uIG9mIGFuIElSUS1zYWZlLCBjcmFzaC1zYWZlCisgKiAga2VybmVs IGNvbnNvbGUgaW1wbGVtZW50YXRpb24gdGhhdCBvdXRwdXRzIGtlcm5lbCBtZXNzYWdlcyB0byB0 aGUKKyAqICBuZXR3b3JrLgorICoKKyAqIE1vZGlmaWNhdGlvbiBoaXN0b3J5OgorICoKKyAqIDIw MDEtMDktMTcgICAgc3RhcnRlZCBieSBJbmdvIE1vbG5hci4KKyAqIDIwMDMtMDMtMTAJICBwb3J0 ZWQgdG8gMi41IGFuZCBsaW5rZWQgaW50byB0aGUga2VybmVsCisgKgkJCWJ5IEV2Z2VuaXkgUG9s eWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+CisgKi8KKworLyoqKioqKioqKioqKioqKioqKioq 
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioKKyAqICAgICAgVGhp cyBwcm9ncmFtIGlzIGZyZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9v ciBtb2RpZnkKKyAqICAgICAgaXQgdW5kZXIgdGhlIHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJhbCBQ dWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkKKyAqICAgICAgdGhlIEZyZWUgU29mdHdhcmUg Rm91bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiwgb3IgKGF0IHlvdXIgb3B0aW9uKQorICogICAg ICBhbnkgbGF0ZXIgdmVyc2lvbi4KKyAqCisgKiAgICAgIFRoaXMgcHJvZ3JhbSBpcyBkaXN0cmli dXRlZCBpbiB0aGUgaG9wZSB0aGF0IGl0IHdpbGwgYmUgdXNlZnVsLAorICogICAgICBidXQgV0lU SE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhvdXQgZXZlbiB0aGUgaW1wbGllZCB3YXJyYW50eSBvZgor ICogICAgICBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVTUyBGT1IgQSBQQVJUSUNVTEFSIFBVUlBP U0UuICBTZWUgdGhlCisgKiAgICAgIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGZvciBtb3Jl IGRldGFpbHMuCisgKgorICogICAgICBZb3Ugc2hvdWxkIGhhdmUgcmVjZWl2ZWQgYSBjb3B5IG9m IHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZQorICogICAgICBhbG9uZyB3aXRoIHRoaXMg cHJvZ3JhbTsgaWYgbm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZQorICogICAgICBGb3Vu ZGF0aW9uLCBJbmMuLCA2NzUgTWFzcyBBdmUsIENhbWJyaWRnZSwgTUEgMDIxMzksIFVTQS4KKyAq CisgKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKi8KKworI2luY2x1ZGUgPG5ldC90Y3AuaD4KKyNpbmNsdWRlIDxuZXQvdWRwLmg+ CisjaW5jbHVkZSA8bGludXgvbW0uaD4KKyNpbmNsdWRlIDxsaW51eC90dHkuaD4KKyNpbmNsdWRl IDxsaW51eC9pbml0Lmg+CisjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+CisjaW5jbHVkZSA8YXNt L3VuYWxpZ25lZC5oPgorI2luY2x1ZGUgPGxpbnV4L2NvbnNvbGUuaD4KKyNpbmNsdWRlIDxsaW51 eC9pbmV0Lmg+CisjaW5jbHVkZSA8bGludXgvc21wX2xvY2suaD4KKyNpbmNsdWRlIDxsaW51eC9u ZXRkZXZpY2UuaD4KKyNpbmNsdWRlIDxsaW51eC90dHlfZHJpdmVyLmg+CisjaW5jbHVkZSA8bGlu dXgvZXRoZXJkZXZpY2UuaD4KKworI2RlZmluZSBERUZQT1JUIDY2NjYKKyNkZWZpbmUgREVGQURE UiBJTkFERFJfQlJPQURDQVNUCisKK3N0YXRpYyBzdHJ1Y3QgbmV0X2RldmljZSAqbmV0Y29uc29s ZV9kZXY7CitzdGF0aWMgdTE2IHNvdXJjZV9wb3J0ID0gREVGUE9SVCwgdGFyZ2V0X3BvcnQgPSBE RUZQT1JUOworc3RhdGljIHUzMiBzb3VyY2VfaXAsIHRhcmdldF9pcCA9IERFRkFERFI7CitzdGF0 
aWMgdW5zaWduZWQgY2hhciBkYWRkcls2XSA9IHsweGZmLCAweGZmLCAweGZmLCAweGZmLCAweGZm LCAweGZmfSA7CisKKyNkZWZpbmUgTkVUQ09OU09MRV9WRVJTSU9OIDB4MDEKKyNkZWZpbmUgSEVB REVSX0xFTiA1CisKKyNkZWZpbmUgTUFYX1VEUF9DSFVOSyAxNDYwCisjZGVmaW5lIE1BWF9QUklO VF9DSFVOSyAoTUFYX1VEUF9DSFVOSy1IRUFERVJfTEVOKQorCisvKgorICogV2UgbWFpbnRhaW4g YSBzbWFsbCBwb29sIG9mIGZ1bGx5LXNpemVkIHNrYnMsCisgKiB0byBtYWtlIHN1cmUgdGhlIG1l c3NhZ2UgZ2V0cyBvdXQgZXZlbiBpbgorICogZXh0cmVtZSBPT00gc2l0dWF0aW9ucy4KKyAqLwor I2RlZmluZSBNQVhfTkVUQ09OU09MRV9TS0JTIDMyCisKK3N0YXRpYyBzcGlubG9ja190IG5ldGNv bnNvbGVfbG9jayA9IFNQSU5fTE9DS19VTkxPQ0tFRDsKK3N0YXRpYyBpbnQgbnJfbmV0Y29uc29s ZV9za2JzOworc3RhdGljIHN0cnVjdCBza19idWZmICpuZXRjb25zb2xlX3NrYnM7CisKKyNkZWZp bmUgTUFYX1NLQl9TSVpFIFwKKwkJKE1BWF9VRFBfQ0hVTksgKyBzaXplb2Yoc3RydWN0IHVkcGhk cikgKyBcCisJCQkJc2l6ZW9mKHN0cnVjdCBpcGhkcikgKyBzaXplb2Yoc3RydWN0IGV0aGhkcikp CisKK3N0YXRpYyB2b2lkIF9fcmVmaWxsX25ldGNvbnNvbGVfc2ticyh2b2lkKQoreworCXN0cnVj dCBza19idWZmICpza2I7CisJdW5zaWduZWQgbG9uZyBmbGFnczsKKworCXNwaW5fbG9ja19pcnFz YXZlKCZuZXRjb25zb2xlX2xvY2ssIGZsYWdzKTsKKwl3aGlsZSAobnJfbmV0Y29uc29sZV9za2Jz IDwgTUFYX05FVENPTlNPTEVfU0tCUykgeworCQlza2IgPSBhbGxvY19za2IoTUFYX1NLQl9TSVpF LCBHRlBfQVRPTUlDKTsKKwkJaWYgKCFza2IpCisJCQlicmVhazsKKwkJaWYgKG5ldGNvbnNvbGVf c2ticykKKwkJCXNrYi0+bmV4dCA9IG5ldGNvbnNvbGVfc2ticzsKKwkJZWxzZQorCQkJc2tiLT5u ZXh0ID0gTlVMTDsKKwkJbmV0Y29uc29sZV9za2JzID0gc2tiOworCQlucl9uZXRjb25zb2xlX3Nr YnMrKzsKKwl9CisJc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmbmV0Y29uc29sZV9sb2NrLCBmbGFn cyk7Cit9CisKK3N0YXRpYyBzdHJ1Y3Qgc2tfYnVmZiAqIGdldF9uZXRjb25zb2xlX3NrYih2b2lk KQoreworCXN0cnVjdCBza19idWZmICpza2I7CisKKwl1bnNpZ25lZCBsb25nIGZsYWdzOworCisJ c3Bpbl9sb2NrX2lycXNhdmUoJm5ldGNvbnNvbGVfbG9jaywgZmxhZ3MpOworCXNrYiA9IG5ldGNv bnNvbGVfc2ticzsKKwlpZiAoc2tiKQorCQluZXRjb25zb2xlX3NrYnMgPSBza2ItPm5leHQ7CisJ c2tiLT5uZXh0ID0gTlVMTDsKKwlucl9uZXRjb25zb2xlX3NrYnMtLTsKKwlzcGluX3VubG9ja19p cnFyZXN0b3JlKCZuZXRjb25zb2xlX2xvY2ssIGZsYWdzKTsKKworCXJldHVybiBza2I7Cit9CisK 
K3N0YXRpYyBzcGlubG9ja190IHNlcXVlbmNlX2xvY2sgPSBTUElOX0xPQ0tfVU5MT0NLRUQ7Citz dGF0aWMgdW5zaWduZWQgaW50IG9mZnNldDsKKworc3RhdGljIHZvaWQgc2VuZF9uZXRjb25zb2xl X3NrYihzdHJ1Y3QgbmV0X2RldmljZSAqZGV2LCBjb25zdCBjaGFyICptc2csIHVuc2lnbmVkIGlu dCBtc2dfbGVuKQoreworCWludCB0b3RhbF9sZW4sIGV0aF9sZW4sIGlwX2xlbiwgdWRwX2xlbjsK Kwl1bnNpZ25lZCBsb25nIGZsYWdzOworCXN0cnVjdCBza19idWZmICpza2I7CisJc3RydWN0IHVk cGhkciAqdWRwaDsKKwlzdHJ1Y3QgaXBoZHIgKmlwaDsKKwlzdHJ1Y3QgZXRoaGRyICpldGg7CisK Kwl1ZHBfbGVuID0gbXNnX2xlbiArIEhFQURFUl9MRU4gKyBzaXplb2YoKnVkcGgpOworCWlwX2xl biA9IGV0aF9sZW4gPSB1ZHBfbGVuICsgc2l6ZW9mKCppcGgpOworCXRvdGFsX2xlbiA9IGV0aF9s ZW4gKyBFVEhfSExFTjsKKworCWlmIChucl9uZXRjb25zb2xlX3NrYnMgPCBNQVhfTkVUQ09OU09M RV9TS0JTKQorCQlfX3JlZmlsbF9uZXRjb25zb2xlX3NrYnMoKTsKKworCXNrYiA9IGFsbG9jX3Nr Yih0b3RhbF9sZW4sIEdGUF9BVE9NSUMpOworCWlmICghc2tiKSB7CisJCXNrYiA9IGdldF9uZXRj b25zb2xlX3NrYigpOworCQlpZiAoIXNrYikKKwkJCS8qIHRvdWdoISAqLworCQkJcmV0dXJuOwor CX0KKworCWF0b21pY19zZXQoJnNrYi0+dXNlcnMsIDEpOworCXNrYl9yZXNlcnZlKHNrYiwgdG90 YWxfbGVuIC0gbXNnX2xlbiAtIEhFQURFUl9MRU4pOworCXNrYi0+ZGF0YVswXSA9IE5FVENPTlNP TEVfVkVSU0lPTjsKKworCXNwaW5fbG9ja19pcnFzYXZlKCZzZXF1ZW5jZV9sb2NrLCBmbGFncyk7 CisJcHV0X3VuYWxpZ25lZChodG9ubChvZmZzZXQpLCAodTMyICopIChza2ItPmRhdGEgKyAxKSk7 CisJb2Zmc2V0ICs9IG1zZ19sZW47CisJc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmc2VxdWVuY2Vf bG9jaywgZmxhZ3MpOworCisJbWVtY3B5KHNrYi0+ZGF0YSArIEhFQURFUl9MRU4sIG1zZywgbXNn X2xlbik7CisJc2tiLT5sZW4gKz0gbXNnX2xlbiArIEhFQURFUl9MRU47CisKKwl1ZHBoID0gKHN0 cnVjdCB1ZHBoZHIgKikgc2tiX3B1c2goc2tiLCBzaXplb2YoKnVkcGgpKTsKKwl1ZHBoLT5zb3Vy Y2UgPSBzb3VyY2VfcG9ydDsKKwl1ZHBoLT5kZXN0ID0gdGFyZ2V0X3BvcnQ7CisJdWRwaC0+bGVu ID0gaHRvbnModWRwX2xlbik7CisJdWRwaC0+Y2hlY2sgPSAwOworCisJaXBoID0gKHN0cnVjdCBp cGhkciAqKXNrYl9wdXNoKHNrYiwgc2l6ZW9mKCppcGgpKTsKKworCWlwaC0+dmVyc2lvbiAgPSA0 OworCWlwaC0+aWhsICAgICAgPSA1OworCWlwaC0+dG9zICAgICAgPSAwOworICAgICAgICBpcGgt PnRvdF9sZW4gID0gaHRvbnMoaXBfbGVuKTsKKwlpcGgtPmlkICAgICAgID0gMDsKKwlpcGgtPmZy 
YWdfb2ZmID0gMDsKKwlpcGgtPnR0bCAgICAgID0gNjQ7CisgICAgICAgIGlwaC0+cHJvdG9jb2wg PSBJUFBST1RPX1VEUDsKKwlpcGgtPmNoZWNrICAgID0gMDsKKyAgICAgICAgaXBoLT5zYWRkciAg ICA9IHNvdXJjZV9pcDsKKyAgICAgICAgaXBoLT5kYWRkciAgICA9IHRhcmdldF9pcDsKKwlpcGgt PmNoZWNrICAgID0gaXBfZmFzdF9jc3VtKCh1bnNpZ25lZCBjaGFyICopaXBoLCBpcGgtPmlobCk7 CisKKwlldGggPSAoc3RydWN0IGV0aGhkciAqKSBza2JfcHVzaChza2IsIEVUSF9ITEVOKTsKKwor CWV0aC0+aF9wcm90byA9IGh0b25zKEVUSF9QX0lQKTsKKwltZW1jcHkoZXRoLT5oX3NvdXJjZSwg ZGV2LT5kZXZfYWRkciwgZGV2LT5hZGRyX2xlbik7CisJbWVtY3B5KGV0aC0+aF9kZXN0LCBkYWRk ciwgZGV2LT5hZGRyX2xlbik7CisKK3JlcGVhdDoKKwlzcGluX2xvY2soJmRldi0+eG1pdF9sb2Nr KTsKKwlkZXYtPnhtaXRfbG9ja19vd25lciA9IHNtcF9wcm9jZXNzb3JfaWQoKTsKKworCWlmIChu ZXRpZl9xdWV1ZV9zdG9wcGVkKGRldikpIHsKKwkJZGV2LT54bWl0X2xvY2tfb3duZXIgPSAtMTsK KwkJc3Bpbl91bmxvY2soJmRldi0+eG1pdF9sb2NrKTsKKworCQlkZXYtPnBvbGxfY29udHJvbGxl cihkZXYpOworCQlnb3RvIHJlcGVhdDsKKwl9CisKKwlkZXYtPmhhcmRfc3RhcnRfeG1pdChza2Is IGRldik7CisKKwlkZXYtPnhtaXRfbG9ja19vd25lciA9IC0xOworCXNwaW5fdW5sb2NrKCZkZXYt PnhtaXRfbG9jayk7Cit9CisKK3N0YXRpYyB2b2lkIHdyaXRlX25ldGNvbnNvbGVfbXNnKHN0cnVj dCBjb25zb2xlICpjb24sIGNvbnN0IGNoYXIgKm1zZywgdW5zaWduZWQgaW50IG1zZ19sZW4pCit7 CisJaW50IGxlbiwgbGVmdDsKKwlzdHJ1Y3QgbmV0X2RldmljZSAqZGV2OworCisJZGV2ID0gbmV0 Y29uc29sZV9kZXY7CisJaWYgKCFkZXYpCisJCXJldHVybjsKKworCWlmIChkZXYtPnBvbGxfY29u dHJvbGxlciAmJiBuZXRpZl9ydW5uaW5nKGRldikpIHsKKwkJdW5zaWduZWQgbG9uZyBmbGFnczsK KworCQlzYXZlX2ZsYWdzKGZsYWdzKTsKKwkJY2xpKCk7CisJCWxlZnQgPSBtc2dfbGVuOworcmVw ZWF0OgorCQlpZiAobGVmdCA+IE1BWF9QUklOVF9DSFVOSykKKwkJCWxlbiA9IE1BWF9QUklOVF9D SFVOSzsKKwkJZWxzZQorCQkJbGVuID0gbGVmdDsKKwkJc2VuZF9uZXRjb25zb2xlX3NrYihkZXYs IG1zZywgbGVuKTsKKwkJbXNnICs9IGxlbjsKKwkJbGVmdCAtPSBsZW47CisJCWlmIChsZWZ0KQor CQkJZ290byByZXBlYXQ7CisJCXJlc3RvcmVfZmxhZ3MoZmxhZ3MpOworCX0KK30KKworc3RhdGlj IGNoYXIgKmRldjsKK3N0YXRpYyBpbnQgdGFyZ2V0X2V0aF9ieXRlMCA9IDI1NTsKK3N0YXRpYyBp bnQgdGFyZ2V0X2V0aF9ieXRlMSA9IDI1NTsKK3N0YXRpYyBpbnQgdGFyZ2V0X2V0aF9ieXRlMiA9 
IDI1NTsKK3N0YXRpYyBpbnQgdGFyZ2V0X2V0aF9ieXRlMyA9IDI1NTsKK3N0YXRpYyBpbnQgdGFy Z2V0X2V0aF9ieXRlNCA9IDI1NTsKK3N0YXRpYyBpbnQgdGFyZ2V0X2V0aF9ieXRlNSA9IDI1NTsK KworI2lmZGVmIE1PRFVMRQorTU9EVUxFX1BBUk0odGFyZ2V0X2lwLCAiaSIpOworTU9EVUxFX1BB Uk0odGFyZ2V0X2V0aF9ieXRlMCwgImkiKTsKK01PRFVMRV9QQVJNKHRhcmdldF9ldGhfYnl0ZTEs ICJpIik7CitNT0RVTEVfUEFSTSh0YXJnZXRfZXRoX2J5dGUyLCAiaSIpOworTU9EVUxFX1BBUk0o dGFyZ2V0X2V0aF9ieXRlMywgImkiKTsKK01PRFVMRV9QQVJNKHRhcmdldF9ldGhfYnl0ZTQsICJp Iik7CitNT0RVTEVfUEFSTSh0YXJnZXRfZXRoX2J5dGU1LCAiaSIpOworTU9EVUxFX1BBUk0oc291 cmNlX3BvcnQsICJoIik7CitNT0RVTEVfUEFSTSh0YXJnZXRfcG9ydCwgImgiKTsKK01PRFVMRV9Q QVJNKGRldiwgInMiKTsKKyNlbmRpZgorCitzdGF0aWMgc3RydWN0IGNvbnNvbGUgbmV0Y29uc29s ZSA9CisJIHsgZmxhZ3M6IENPTl9FTkFCTEVELCB3cml0ZTogd3JpdGVfbmV0Y29uc29sZV9tc2cg fTsKKworI2lmIDAKK3N0YXRpYyBpbnQgbmNfcmVjdl9wYWNrZXQoc3RydWN0IHNrX2J1ZmYgKnNr Yiwgc3RydWN0IG5ldF9kZXZpY2UgKmRldiwgc3RydWN0IHBhY2tldF90eXBlICpwdCk7CitzdGF0 aWMgc3RydWN0IHBhY2tldF90eXBlIG5jX3BhY2tldF90eXBlIF9faW5pdGRhdGEgPSB7CisJLnR5 cGUgPSBfX2NvbnN0YW50X2h0b25zKEVUSF9QX0lQKSwKKwkuZnVuYyA9IG5jX3JlY3ZfcGFja2V0 LAorfTsKKworc3RhdGljIHZvaWQgbmNfc2xlZXAoaW50IHNlYykKK3sKKwl1bnNpZ25lZCBsb25n IGppZmY7CisJCisJamlmZiA9IGppZmZpZXMgKyA1KkhaOworCXdoaWxlICh0aW1lX2JlZm9yZShq aWZmaWVzLCBqaWZmKSkKKwkJOworCit9CitzdGF0aWMgaW50IG5jX3JlY3ZfcGFja2V0KHN0cnVj dCBza19idWZmICpza2IsIHN0cnVjdCBuZXRfZGV2aWNlICpkZXYsIHN0cnVjdCBwYWNrZXRfdHlw ZSAqcHQpCit7CisJcmV0dXJuIDA7Cit9CisjZW5kaWYKK3N0YXRpYyBzdHJ1Y3QgbmV0X2Rldmlj ZSAqbmNfaW5pdF9uZXRjb25zb2xlKCkKK3sKKwlzdHJ1Y3QgbmV0X2RldmljZSAqX2RldjsKKwlp bnQgY291bnQgPSAwOworCisJcnRubF9zaGxvY2soKTsKKwlmb3IgKF9kZXY9ZGV2X2Jhc2U7IF9k ZXY7IF9kZXYgPSBfZGV2LT5uZXh0KQorCXsKKwkJcHJpbnRrKEtFUk5fSU5GTyAiJXM6IHByb2Jp bmcgZGV2aWNlIDwlcz5cbiIsIF9fZnVuY19fLCBfZGV2LT5uYW1lKTsKKwkJCisJCWlmICghc3Ry bmNtcChfZGV2LT5uYW1lLCAiZHVtbXkiLCA1KSB8fCAoX2Rldi0+ZmxhZ3MmSUZGX0xPT1BCQUNL KSkKKwkJCWNvbnRpbnVlOworCQlpZiAoZGV2X2NoYW5nZV9mbGFncyhfZGV2LCBfZGV2LT5mbGFn 
cyB8IElGRl9VUCkgPCAwKSB7CisJCQlwcmludGsoS0VSTl9FUlIgIiVzOiBmYWlsZWQgdG8gb3Bl biAlc1xuIiwgX19mdW5jX18sIF9kZXYtPm5hbWUpOworCQkJY29udGludWU7CisJCX0KKwkJY291 bnQrKzsKKwkJYnJlYWs7CisJfQorCXJ0bmxfc2h1bmxvY2soKTsKKworCWlmIChjb3VudCA9PSAw KQorCQlyZXR1cm4gTlVMTDsKKworCWRldiA9IF9kZXYtPm5hbWU7CisJCisJLy9kZXZfYWRkX3Bh Y2soJm5jX3BhY2tldF90eXBlKTsKKwkKKwlyZXR1cm4gX2RldjsKK30KKworc3RhdGljIGludCBf X2luaXQgaW5pdF9uZXRjb25zb2xlKHZvaWQpCit7CisJc3RydWN0IG5ldF9kZXZpY2UgKm5kZXYg PSBOVUxMOworCisJbmRldiA9IG5jX2luaXRfbmV0Y29uc29sZSgpOworCWlmICghbmRldikgewor CQlwcmludGsoS0VSTl9FUlIgIm5ldGNvbnNvbGU6IG5ldHdvcmsgZGV2aWNlICVzIGRvZXMgbm90 IGV4aXN0LCBhYm9ydGluZy5cbiIsIGRldik7CisJCXJldHVybiAtMTsKKwl9CisKKwlwcmludGso S0VSTl9JTkZPICIlczogVXNpbmcgZGV2aWNlIDwlcz5cbiIsIF9fZnVuY19fLCBuZGV2LT5uYW1l KTsKKwkKKwlpZiAoIW5kZXYtPnBvbGxfY29udHJvbGxlcikgeworCQlwcmludGsoS0VSTl9FUlIg Im5ldGNvbnNvbGU6ICVzJ3MgbmV0d29yayBkcml2ZXIgZG9lcyBub3QgaW1wbGVtZW50IG5ldGxv Z2dpbmcgeWV0LCBhYm9ydGluZy5cbiIsIGRldik7CisJCXJldHVybiAtMTsKKwl9CisKKwlyZWdp c3Rlcl9jb25zb2xlKCZuZXRjb25zb2xlKTsKKworCXNvdXJjZV9pcCA9IG50b2hsKGluX2F0b24o IjEwLjAuMC4yIikpOworI2RlZmluZSBJUCh4KSAoKGNoYXIgKikmc291cmNlX2lwKVt4XQorCXBy aW50ayhLRVJOX0lORk8gIm5ldGNvbnNvbGU6IHVzaW5nIHNvdXJjZSBJUCAlaS4laS4laS4laVxu IiwKKwkJSVAoMyksIElQKDIpLCBJUCgxKSwgSVAoMCkpOworI3VuZGVmIElQCisJc291cmNlX2lw ID0gaHRvbmwoc291cmNlX2lwKTsKKyNkZWZpbmUgSVAoeCkgKChjaGFyICopJnRhcmdldF9pcClb eF0KKwlwcmludGsoS0VSTl9JTkZPICJuZXRjb25zb2xlOiB1c2luZyB0YXJnZXQgSVAgJWkuJWku JWkuJWlcbiIsCisJCUlQKDMpLCBJUCgyKSwgSVAoMSksIElQKDApKTsKKyN1bmRlZiBJUAorCXRh cmdldF9pcCA9IGh0b25sKHRhcmdldF9pcCk7CisJcHJpbnRrKEtFUk5fSU5GTyAibmV0Y29uc29s ZTogdXNpbmcgc291cmNlIFVEUCBwb3J0OiAlaVxuIiwgc291cmNlX3BvcnQpOworCXNvdXJjZV9w b3J0ID0gaHRvbnMoc291cmNlX3BvcnQpOworCXByaW50ayhLRVJOX0lORk8gIm5ldGNvbnNvbGU6 IHVzaW5nIHRhcmdldCBVRFAgcG9ydDogJWlcbiIsIHRhcmdldF9wb3J0KTsKKwl0YXJnZXRfcG9y dCA9IGh0b25zKHRhcmdldF9wb3J0KTsKKworCWRhZGRyWzBdID0gdGFyZ2V0X2V0aF9ieXRlMDsK 
KwlkYWRkclsxXSA9IHRhcmdldF9ldGhfYnl0ZTE7CisJZGFkZHJbMl0gPSB0YXJnZXRfZXRoX2J5 dGUyOworCWRhZGRyWzNdID0gdGFyZ2V0X2V0aF9ieXRlMzsKKwlkYWRkcls0XSA9IHRhcmdldF9l dGhfYnl0ZTQ7CisJZGFkZHJbNV0gPSB0YXJnZXRfZXRoX2J5dGU1OworCisJaWYgKChkYWRkclsw XSAmIGRhZGRyWzFdICYgZGFkZHJbMl0gJiBkYWRkclszXSAmIGRhZGRyWzRdICYgZGFkZHJbNV0p ID09IDI1NSkKKwkJcHJpbnRrKEtFUk5fSU5GTyAibmV0Y29uc29sZTogdXNpbmcgYnJvYWRjYXN0 IGV0aGVybmV0IGZyYW1lcyB0byBzZW5kIHBhY2tldHMuXG4iKTsKKwllbHNlCisJCXByaW50ayhL RVJOX0lORk8gIm5ldGNvbnNvbGU6IHVzaW5nIHRhcmdldCBldGhlcm5ldCBhZGRyZXNzICUwMng6 JTAyeDolMDJ4OiUwMng6JTAyeDolMDJ4LlxuIiwgZGFkZHJbMF0sIGRhZGRyWzFdLCBkYWRkclsy XSwgZGFkZHJbM10sIGRhZGRyWzRdLCBkYWRkcls1XSk7CisJCQorCW5ldGNvbnNvbGVfZGV2ID0g bmRldjsKKyNkZWZpbmUgU1RBUlRVUF9NU0cgIlsuLi5uZXR3b3JrIGNvbnNvbGUgc3RhcnR1cC4u Ll1cbiIKKwl3cml0ZV9uZXRjb25zb2xlX21zZyhOVUxMLCBTVEFSVFVQX01TRywgc3RybGVuKFNU QVJUVVBfTVNHKSk7CisKKwlwcmludGsoS0VSTl9JTkZPICJuZXRjb25zb2xlOiBuZXR3b3JrIGxv Z2dpbmcgc3RhcnRlZCB1cCBzdWNjZXNzZnVsbHkhXG4iKTsKKwlyZXR1cm4gMDsKK30KKworc3Rh dGljIHZvaWQgX19leGl0IGNsZWFudXBfbmV0Y29uc29sZSh2b2lkKQoreworCXByaW50ayhLRVJO X0lORk8gIm5ldGNvbnNvbGU6IG5ldHdvcmsgbG9nZ2luZyBzaHV0IGRvd24uXG4iKTsKKwl1bnJl Z2lzdGVyX2NvbnNvbGUoJm5ldGNvbnNvbGUpOworCisjZGVmaW5lIFNIVVRET1dOX01TRyAiWy4u Lm5ldHdvcmsgY29uc29sZSBzaHV0ZG93bi4uLl1cbiIKKwl3cml0ZV9uZXRjb25zb2xlX21zZyhO VUxMLCBTSFVURE9XTl9NU0csIHN0cmxlbihTSFVURE9XTl9NU0cpKTsKKwluZXRjb25zb2xlX2Rl diA9IE5VTEw7Cit9CisKK21vZHVsZV9pbml0KGluaXRfbmV0Y29uc29sZSk7Cittb2R1bGVfZXhp dChjbGVhbnVwX25ldGNvbnNvbGUpOworCitpbnQgZHVtbXkgPSBNQVhfU0tCX1NJWkU7Cg== --Multipart_Mon__10_Mar_2003_22:22:05_+0300_082e02e8-- From mochel@osdl.org Mon Mar 10 14:51:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 10 Mar 2003 14:51:34 -0800 (PST) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2AMpVq9031498 for ; Mon, 10 Mar 2003 14:51:31 -0800 Received: from localhost (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id h2AMpRG15885; 
Mon, 10 Mar 2003 14:51:27 -0800 Date: Mon, 10 Mar 2003 16:27:01 -0600 (CST) From: Patrick Mochel X-X-Sender: To: Andreas Jellinghaus cc: , , Subject: Re: 2.5.64 oops in ppp / pppo2 / kobject In-Reply-To: <1047336461.10548.3.camel@simulacron> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1921 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mochel@osdl.org Precedence: bulk X-list: netdev Content-Length: 1990 Lines: 79 On 10 Mar 2003, Andreas Jellinghaus wrote: > pppoe link failed, then ppp oopsed. > > Also, shutting down the system ends in a deadlock > (or so? nothing is happening, lots of processes in __down.) > > plain 2.5.64 plus ipv6/setkey patch (unreleated i think). Please try the latest BK snapshot from http://kernel.org/pub/linux/kernel/v2.5/snapshots/ plus this patch on top of it. This problem has been reported before, and this patch should fix it. Thanks, -pat ===== fs/sysfs/dir.c 1.4 vs edited ===== --- 1.4/fs/sysfs/dir.c Sat Mar 8 23:42:32 2003 +++ edited/fs/sysfs/dir.c Sun Mar 9 16:01:45 2003 @@ -98,7 +98,6 @@ * Unlink and unhash. */ spin_unlock(&dcache_lock); - d_delete(d); simple_unlink(dentry->d_inode,d); dput(d); spin_lock(&dcache_lock); @@ -108,16 +107,11 @@ } spin_unlock(&dcache_lock); up(&dentry->d_inode->i_sem); - d_invalidate(dentry); - simple_rmdir(parent->d_inode,dentry); d_delete(dentry); + simple_rmdir(parent->d_inode,dentry); pr_debug(" o %s removing done (%d)\n",dentry->d_name.name, atomic_read(&dentry->d_count)); - /** - * Drop reference from initial sysfs_get_dentry(). - */ - dput(dentry); /** * Drop reference from dget() on entrance.
===== fs/sysfs/inode.c 1.83 vs edited ===== --- 1.83/fs/sysfs/inode.c Mon Mar 3 17:11:29 2003 +++ edited/fs/sysfs/inode.c Sun Mar 9 14:25:45 2003 @@ -93,19 +93,14 @@ /* make sure dentry is really there */ if (victim->d_inode && (victim->d_parent->d_inode == dir->d_inode)) { - simple_unlink(dir->d_inode,victim); - d_delete(victim); - pr_debug("sysfs: Removing %s (%d)\n", victim->d_name.name, atomic_read(&victim->d_count)); - /* - * Drop reference from initial sysfs_get_dentry(). - */ - dput(victim); + + simple_unlink(dir->d_inode,victim); + } - - /** - * Drop the reference acquired from sysfs_get_dentry() above. + /* + * Drop reference from sysfs_get_dentry() above. */ dput(victim); } From anton@samba.org Mon Mar 10 19:59:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 10 Mar 2003 19:59:38 -0800 (PST) Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2B3xKq9006081 for ; Mon, 10 Mar 2003 19:59:21 -0800 Received: by lists.samba.org (Postfix, from userid 504) id D98CB2C053; Tue, 11 Mar 2003 03:30:00 +0000 (GMT) Date: Tue, 11 Mar 2003 14:29:50 +1100 From: Anton Blanchard To: netdev@oss.sgi.com Subject: alignment of SKBs Message-ID: <20030311032950.GB1132@krispykreme> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.3i X-archive-position: 1922 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev Content-Length: 1280 Lines: 29 Hi, Linux likes to align TCP/IP headers for the benefit of the CPU. No problems there. However, especially with gigabit, it is important to try and minimise the number of PCI transactions. As an example, the e1000 driver currently starts most transmit packets 14 bytes from the start of a cacheline and all receive packets 18 bytes from the start of a cacheline. 
On ppc64 this is going to be expensive on the PCI bus, especially for DMA writes as we work our way up to cacheline alignment. Unaligned loads and stores on the headers should be reasonably quick on recent ppc64 machines, so the tradeoff is definitely towards optimising for the PCI bus. Unfortunately we can't do anything about it because skb_reserve() is used everywhere. Perhaps if we had another macro (skb_align?) we could override it on a per arch basis. While the receive side is easy to fix (modify the skb_reserve in the e1000 and dev_skb_alloc routines), the transmit side is more difficult. Luckily DMA reads tend to be less of an issue. From my reading of the code, on transmits we copy the data in before we put the TCP header together. I guess we could arrange things so that the common case would fall on a cacheline boundary and the uncommon case would overflow into the cacheline before. Anton From pekkas@netcore.fi Mon Mar 10 22:25:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 10 Mar 2003 22:25:12 -0800 (PST) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2B6OQq9008644 for ; Mon, 10 Mar 2003 22:25:08 -0800 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h2B6OKY00515 for ; Tue, 11 Mar 2003 08:24:20 +0200 Date: Tue, 11 Mar 2003 08:24:19 +0200 (EET) From: Pekka Savola To: netdev@oss.sgi.com Subject: Is RFC1822 -type License on IPR good enough? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1923 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev Content-Length: 988 Lines: 25 Hi, An issue has just come up when IETF is standardizing Secure Neighbor Discovery for IPv6. Some organizations, including Microsoft and Ericsson, have IPR claims on a mechanism which would be very useful to the mechanism.
Folks are working with the organizations, hoping to get a licensing agreement like RFC1822. I'd like to solicit opinions whether this is considered "good enough" for possible implementation in Linux kernel, or whether it would lock us out. (Also remember the tradeoff: if this technique is unacceptable, there are no easy alternatives to solving the problem, only very difficult ones). I'm assuming the license like that would be appropriate as it has been used in other free systems in protocols like IKE. If you think there is a problem, please send a note ASAP. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings From ahu@outpost.ds9a.nl Tue Mar 11 01:45:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 01:45:14 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2B9iMq9016508 for ; Tue, 11 Mar 2003 01:45:04 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 615B64573; Tue, 11 Mar 2003 10:44:20 +0100 (CET) Date: Tue, 11 Mar 2003 10:44:20 +0100 From: bert hubert To: Alexey Kuznetsov , Martin Devera , Linux Kernel Mailinlist , David Jarvis , netdev@oss.sgi.com Subject: Re: kernel panic: bug in sch_sfq.c Message-ID: <20030311094420.GB19658@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Alexey Kuznetsov , Martin Devera , Linux Kernel Mailinlist , David Jarvis , netdev@oss.sgi.com References: <20030311091409.GA4491@oasis.frogfoot.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030311091409.GA4491@oasis.frogfoot.net> User-Agent: Mutt/1.3.28i X-archive-position: 1924 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 6312 Lines: 138 On Tue, Mar 11, 2003 at 11:14:09AM 
+0200, Abraham van der Merwe wrote: > Hi! > > I have a box that crashed today. Below is the decoded kernel panic. If you > track down the bug PLEASE send me a patch. Weird, Alexey's code is normally very very solid. Perhaps HTB is also involved. Devik? > > ------------< snip <------< snip <------< snip <------------ > ksymoops 2.4.8 on i686 2.4.20-rc1. Options used > -v vmlinux-2.4.21-pre5 (specified) > -K (specified) > -L (specified) > -O (specified) > -m System.map-2.4.21-pre5 (specified) > > Unable to handle kernel NULL pointer dereference at virtual address 00000004 > *pde = 00000000 > Oops: 0002 > CPU: 0 > EIP: 0010:[] Not tainted > Using defaults from ksymoops -t elf32-i386 -a i386 > EFLAGS: 00010202 > eax: 00000000 ebx: c7b9a9e8 ecx: 0000007f edx: c7a8eef8 > esi: c7b9ab08 edi: 000007f0 ebp: c7a8e060 esp: c021deb8 > ds: 0018 es: 0018 ss: 0018 > Process swapper (pid: 0, stackpage=c021d000) > Stack: c7b9a9e8 c7b9ab08 c7f7ee00 c7b9a860 c7b893c0 c7f7ee00 c7b9a860 00000000 > c01a3507 c7b5c680 7fb9a9f0 c01a339e c7a8e000 ffffffff 00000018 00000006 > c7b9a800 00000018 00000006 c7b9a800 c7b9a9e8 c7b9ab08 c7f7ee00 c01a371c > Call Trace: [] [] [] [] [>c019949d>] > [] [] [] [] [] [] > [] [] [] [] [] > Code: 89 50 04 89 02 8b 5c 24 24 c7 03 00 00 00 00 c7 43 04 00 00 > > > >>EIP; c01a5399 <===== > > >>esp; c021deb8 > > Trace; c01a3507 > Trace; c01a339e > Trace; c01a371c > Trace; c019f7a3 > Trace; c0115a6a > Trace; c01082bd > Trace; c0105240 > Trace; c0105240 > Trace; c010a528 > Trace; c0105240 > Trace; c0105240 > Trace; c0105263 > Trace; c01052d2 > Trace; c0105000 <_stext+0/0> > Trace; c0105027 > > Code; c01a5399 > 00000000 <_EIP>: > Code; c01a5399 <===== > 0: 89 50 04 mov %edx,0x4(%eax) <===== > Code; c01a539c > 3: 89 02 mov %eax,(%edx) > Code; c01a539e > 5: 8b 5c 24 24 mov 0x24(%esp,1),%ebx > Code; c01a53a2 > 9: c7 03 00 00 00 00 movl $0x0,(%ebx) > Code; c01a53a8 > f: c7 43 04 00 00 00 00 movl $0x0,0x4(%ebx) > > <0>Kernel panic: Aiee, killing interrupt handler!
> ------------< snip <------< snip <------< snip <------------ > > Below are the rules that were installed on the system: > > ------------< snip <------< snip <------< snip <------------ > /sbin/tc qdisc del dev eth0 root > /sbin/tc qdisc del dev eth1 root > /sbin/iptables -t mangle -F qos > /sbin/iptables -t mangle -Z qos > /sbin/tc qdisc add dev eth0 root handle 1: htb default 5 r2q 1 > /sbin/tc class add dev eth0 parent 1: classid 1:1 htb rate 96kbit > /sbin/tc class add dev eth0 parent 1:1 classid 1:2 htb rate 96kbit ceil 96kbit > /sbin/tc class add dev eth0 parent 1:2 classid 1:3 htb rate 48kbit ceil 96kbit prio 1 > /sbin/tc qdisc add dev eth0 handle 3: parent 1:3 sfq perturb 10 limit 31 > /sbin/tc class add dev eth0 parent 1:2 classid 1:4 htb rate 24kbit ceil 96kbit prio 1 > /sbin/tc qdisc add dev eth0 handle 4: parent 1:4 sfq perturb 10 limit 31 > /sbin/tc class add dev eth0 parent 1:2 classid 1:5 htb rate 16kbit ceil 96kbit prio 2 > /sbin/tc qdisc add dev eth0 handle 5: parent 1:5 sfq perturb 10 limit 31 > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.85.0/28 -j CLASSIFY --set-class 1:3 > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.85.80/28 -j CLASSIFY --set-class 1:4 > /sbin/iptables -t mangle -A qos -o eth0 -s 192.116.106.192/29 -j CLASSIFY --set-class 1:0 > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.28.48/29 -j CLASSIFY --set-class 1:0 > /sbin/tc qdisc add dev eth1 root handle 1: htb default 5 r2q 2 > /sbin/tc class add dev eth1 parent 1: classid 1:1 htb rate 512kbit > /sbin/tc class add dev eth1 parent 1:1 classid 1:2 htb rate 256kbit ceil 512kbit > /sbin/tc class add dev eth1 parent 1:2 classid 1:3 htb rate 128kbit ceil 512kbit prio 1 > /sbin/tc qdisc add dev eth1 handle 3: parent 1:3 sfq perturb 10 limit 169 > /sbin/tc class add dev eth1 parent 1:2 classid 1:4 htb rate 64kbit ceil 512kbit prio 1 > /sbin/tc qdisc add dev eth1 handle 4: parent 1:4 sfq perturb 10 limit 169 > /sbin/tc class add dev eth1 parent 1:2 classid 1:5 htb rate 32kbit 
ceil 512kbit prio 2 > /sbin/tc qdisc add dev eth1 handle 5: parent 1:5 sfq perturb 10 limit 169 > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.85.0/28 -j CLASSIFY --set-class 1:3 > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.85.80/28 -j CLASSIFY --set-class 1:4 > /sbin/iptables -t mangle -A qos -o eth1 -d 192.116.106.192/29 -j CLASSIFY --set-class 1:0 > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.28.48/29 -j CLASSIFY --set-class 1:0 > ------------< snip <------< snip <------< snip <------------ > > I've made tons of info available on my home page for you to look at (proc > files, vmlinux, System.map, original panic message, etc. > > http://oasis.frogfoot.net/sfq/ > > -- > > Regards > Abraham > > I saw what you did and I know who you are. > > ___________________________________________________ > Abraham vd Merwe [ZR1BBQ] - Frogfoot Networks > P.O. Box 3472, Matieland, Stellenbosch, 7602 > Cell: +27 82 565 4451 Http: http://www.frogfoot.net/ > Email: abz@frogfoot.net > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From erik@hensema.net Tue Mar 11 03:18:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 03:18:16 -0800 (PST) Received: from dexter.hensema.net (cc78409-a.hnglo1.ov.home.nl [212.120.97.185]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BBIBq9021010 for ; Tue, 11 Mar 2003 03:18:13 -0800 Received: from bender.home.hensema.net (bender.home.hensema.net [192.168.1.252]) by dexter.hensema.net (8.12.3/8.12.3) with ESMTP id h2BBI2p4017374; Tue, 11 Mar 2003 12:18:02 +0100 Received: from bender.home.hensema.net (localhost [127.0.0.1]) by bender.home.hensema.net 
(8.12.3/8.12.3) with ESMTP id h2BBI2T3001911; Tue, 11 Mar 2003 12:18:02 +0100 Received: (from erik@localhost) by bender.home.hensema.net (8.12.3/8.12.3/Submit) id h2BBI14s001910; Tue, 11 Mar 2003 12:18:01 +0100 Date: Tue, 11 Mar 2003 12:18:01 +0100 From: Erik Hensema To: netdev@oss.sgi.com Cc: LARTC , Netfilter Development Mailinglist Subject: [PATCH 2.4.21-pre4] Propagate netfilter MARK value when tunneling Message-ID: <20030311111801.GA1853@hensema.net> Reply-To: erik@hensema.net Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="k1lZvvs/B4yU6o8G" Content-Disposition: inline User-Agent: Mutt/1.3.27i X-archive-position: 1925 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: erik@hensema.net Precedence: bulk X-list: netdev Content-Length: 3484 Lines: 99 --k1lZvvs/B4yU6o8G Content-Type: text/plain; charset=us-ascii Content-Disposition: inline This patch enables the user to propagate netfilter MARK values from tunneled packets to the tunnel packets. The primary use for this is QoS: it enables you to MARK a packet before it enters a tunnel and then later pick up the packet when it's about to leave the physical interface. jamal suggested to also propagate other skb specifics like the tcindex and priority. I haven't included these in the current patch for the very simple reason that I don't understand what they mean ;-) The patch is currently limited to GRE, IPIP and SIT. 
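As a rough illustration of the idea (a userspace model, not kernel code): when the tunnel transmit path has to build a new buffer for the encapsulated packet, the mark set on the original packet has to be copied across, otherwise a classifier looking at the outgoing tunnel packet no longer sees it. The struct demo_skb type and the demo_* helpers below are hypothetical stand-ins invented for this sketch; only the nfmark field name is taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct demo_skb {
    unsigned long nfmark;  /* netfilter MARK value, named as in the patch */
    size_t len;            /* payload length in bytes */
    unsigned char *data;   /* payload bytes */
};

/* Allocate a zero-filled buffer of `len` bytes (len > 0) carrying `mark`. */
static struct demo_skb *demo_skb_new(size_t len, unsigned long mark)
{
    struct demo_skb *skb = calloc(1, sizeof(*skb));
    if (!skb)
        return NULL;
    skb->data = calloc(1, len);
    if (!skb->data) {
        free(skb);
        return NULL;
    }
    skb->len = len;
    skb->nfmark = mark;
    return skb;
}

static void demo_skb_free(struct demo_skb *skb)
{
    if (skb) {
        free(skb->data);
        free(skb);
    }
}

/*
 * Model of the reallocation path: build a new buffer with room for an
 * outer header of `hdr_len` bytes, copy the payload behind it, and free
 * the old buffer. With `propagate` set, the mark is copied as well.
 * On allocation failure NULL is returned and `inner` is left untouched.
 */
static struct demo_skb *demo_encapsulate(struct demo_skb *inner,
                                         size_t hdr_len, int propagate)
{
    struct demo_skb *outer;

    assert(inner != NULL);
    outer = demo_skb_new(inner->len + hdr_len, 0);
    if (!outer)
        return NULL;
    memcpy(outer->data + hdr_len, inner->data, inner->len);
    if (propagate)
        outer->nfmark = inner->nfmark;
    demo_skb_free(inner);
    return outer;
}
```

In this model, a packet marked 7 before encapsulation still carries mark 7 afterwards when propagate is set, and mark 0 when it is not.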
Patch is attached to this mail, but also can be downloaded from http://dexter.hensema.net/~erik/patches/netfilter-propagate-mark-2.4.21-pre4.diff -- Erik Hensema (erik@hensema.net) --k1lZvvs/B4yU6o8G Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="netfilter-propagate-mark-2.4.21-pre4.diff" --- ../linux-2.4.21-pre4/net/Config.in Sat Aug 3 02:39:46 2002 +++ net/Config.in Tue Mar 11 12:08:29 2003 @@ -13,6 +13,7 @@ bool 'Network packet filtering (replaces ipchains)' CONFIG_NETFILTER if [ "$CONFIG_NETFILTER" = "y" ]; then bool ' Network packet filtering debugging' CONFIG_NETFILTER_DEBUG + bool ' Propagate netfilter MARK value when tunneling' CONFIG_NETFILTER_PROPAGATE_MARK fi bool 'Socket Filtering' CONFIG_FILTER tristate 'Unix domain sockets' CONFIG_UNIX --- ../linux-2.4.21-pre4/net/ipv4/ipip.c Fri Nov 29 00:53:15 2002 +++ net/ipv4/ipip.c Tue Mar 11 11:58:50 2003 @@ -619,6 +619,9 @@ } if (skb->sk) skb_set_owner_w(new_skb, skb->sk); +#ifdef CONFIG_NETFILTER_PROPAGATE_MARK + new_skb->nfmark = skb->nfmark; +#endif dev_kfree_skb(skb); skb = new_skb; } --- ../linux-2.4.21-pre4/net/ipv4/ip_gre.c Fri Nov 29 00:53:15 2002 +++ net/ipv4/ip_gre.c Tue Mar 11 11:59:07 2003 @@ -822,6 +822,9 @@ } if (skb->sk) skb_set_owner_w(new_skb, skb->sk); +#ifdef CONFIG_NETFILTER_PROPAGATE_MARK + new_skb->nfmark = skb->nfmark; +#endif dev_kfree_skb(skb); skb = new_skb; } --- ../linux-2.4.21-pre4/net/ipv6/sit.c Fri Nov 29 00:53:15 2002 +++ net/ipv6/sit.c Tue Mar 11 11:59:20 2003 @@ -571,6 +571,9 @@ } if (skb->sk) skb_set_owner_w(new_skb, skb->sk); +#ifdef CONFIG_NETFILTER_PROPAGATE_MARK + new_skb->nfmark = skb->nfmark; +#endif dev_kfree_skb(skb); skb = new_skb; } --- ../linux-2.4.21-pre4/Documentation/Configure.help Wed Feb 26 10:51:16 2003 +++ Documentation/Configure.help Tue Mar 11 12:05:37 2003 @@ -2507,6 +2507,22 @@ You can say Y here if you want to get additional messages useful in debugging the netfilter code. 
+Propagate netfilter MARK value when tunneling +CONFIG_NETFILTER_PROPAGATE_MARK + With this option enabled, netfilter MARK values are propagated from + tunneled packets to the tunnel packets. It enables you to trace + packets from before they enter the tunnel to the point where they + leave the physical interface. + + One of the possible uses is marking packets for QoS before they + enter a tunnel. These mark values can then be picked up by filters + defined by the "tc" utility when they're about the leave the + physical interface. + + This option currently works for GRE, IPIP and SIT tunnels. + + If unsure, say N. + Connection tracking (required for masq/NAT) CONFIG_IP_NF_CONNTRACK Connection tracking keeps a record of what packets have passed --k1lZvvs/B4yU6o8G-- From devik@cdi.cz Tue Mar 11 04:06:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 04:06:12 -0800 (PST) Received: from luxik.cdi.cz (inway106.cdi.cz [213.151.81.106]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BC55q9021871 for ; Tue, 11 Mar 2003 04:05:58 -0800 Received: from a76-137.dialup.iol.cz ([194.228.137.76] helo=devix) by luxik.cdi.cz with asmtp (Exim 3.34 #3) id 18siSX-0001Az-00; Tue, 11 Mar 2003 13:02:22 +0100 Received: from devik (helo=localhost) by devix with local-esmtp (Exim 3.16 #8) id 18siME-0000J1-00; Tue, 11 Mar 2003 12:55:46 +0100 Date: Tue, 11 Mar 2003 12:55:46 +0100 (CET) From: devik X-X-Sender: To: bert hubert cc: Alexey Kuznetsov , Linux Kernel Mailinlist , David Jarvis , Subject: Re: kernel panic: bug in sch_sfq.c In-Reply-To: <20030311094420.GB19658@outpost.ds9a.nl> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1926 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: devik@cdi.cz Precedence: bulk X-list: netdev Content-Length: 7581 Lines: 167 Hmm, I looked at it. It seems that skb linked list was corrupted (containing NULL pointer). 
It could be caused by one of two problems: either someone (maybe HTB too) has overwritten memory, or HTB called dequeue with a wrong argument.
The latter is unlikely because I call q->dequeue and sfq's dequeue was really called, so the pointer is OK.
Let's examine how HTB could meddle with qdisc internals. If HTB thought that a leaf was an inner node, inner.feed[0] is a pointer equal to leaf.q. I examined the code, but there is no way to make this mistake.

In the last 3 days I got 3 bug reports. Each crashes in a different place and all seem unrelated. Each is a NULL pointer dereference, though.
I think there is some place in some code that writes to random places in memory :-\

To everyone who has seen such an Oops: have you used dynamic tc class changes? Like creating/deleting/changing/viewing classes often at runtime? (I'm trying to find a common trigger.)

thanks, devik
HTB maintainer

On Tue, 11 Mar 2003, bert hubert wrote:

> On Tue, Mar 11, 2003 at 11:14:09AM +0200, Abraham van der Merwe wrote:
> > Hi!
> >
> > I have a box that crashed today. Below is the decoded kernel panic. If you
> > track down the bug PLEASE send me a patch.
>
> Weird, Alexeys code is normally very very solid. Perhaps HTB is also
> involved. Devik?
>
> >
> > ------------< snip <------< snip <------< snip <------------
> > ksymoops 2.4.8 on i686 2.4.20-rc1.
Options used > > -v vmlinux-2.4.21-pre5 (specified) > > -K (specified) > > -L (specified) > > -O (specified) > > -m System.map-2.4.21-pre5 (specified) > > > > Unable to handle kernel NULL pointer dereference at virtual address 00000004 > > *pde = 00000000 > > Oops: 0002 > > CPU: 0 > > EIP: 0010:[] Not tainted > > Using defaults from ksymoops -t elf32-i386 -a i386 > > EFLAGS: 00010202 > > eax: 00000000 ebx: c7b9a9e8 ecx: 0000007f edx: c7a8eef8 > > esi: c7b9ab08 edi: 000007f0 ebp: c7a8e060 esp: c021deb8 > > ds: 0018 es: 0018 ss: 0018 > > Process swapper (pid: 0, stackpage=c021d000) > > Stack: c7b9a9e8 c7b9ab08 c7f7ee00 c7b9a860 c7b893c0 c7f7ee00 c7b9a860 00000000 > > c01a3507 c7b5c680 7fb9a9f0 c01a339e c7a8e000 ffffffff 00000018 00000006 > > c7b9a800 00000018 00000006 c7b9a800 c7b9a9e8 c7b9ab08 c7f7ee00 c01a371c > > Call Trace: [] [] [] [] [>c019949d>] > > [] [] [] [] [] [] > > [] [] [] [] [] > > Code: 89 50 04 89 02 8b 5c 24 24 c7 03 00 00 00 00 c7 43 04 00 00 > > > > > > >>EIP; c01a5399 <===== > > > > >>esp; c021deb8 > > > > Trace; c01a3507 > > Trace; c01a339e > > Trace; c01a371c > > Trace; c019f7a3 > > Trace; c0115a6a > > Trace; c01082bd > > Trace; c0105240 > > Trace; c0105240 > > Trace; c010a528 > > Trace; c0105240 > > Trace; c0105240 > > Trace; c0105263 > > Trace; c01052d2 > > Trace; c0105000 <_stext+0/0> > > Trace; c0105027 > > > > Code; c01a5399 > > 00000000 <_EIP>: > > Code; c01a5399 <===== > > 0: 89 50 04 mov %edx,0x4(%eax) <===== > > Code; c01a539c > > 3: 89 02 mov %eax,(%edx) > > Code; c01a539e > > 5: 8b 5c 24 24 mov 0x24(%esp,1),%ebx > > Code; c01a53a2 > > 9: c7 03 00 00 00 00 movl $0x0,(%ebx) > > Code; c01a53a8 > > f: c7 43 04 00 00 00 00 movl $0x0,0x4(%ebx) > > > > <0>Kernel panic: Aiee, killing interrupt handler! 
> > ------------< snip <------< snip <------< snip <------------ > > > > Below are the rules that were installed on the system: > > > > ------------< snip <------< snip <------< snip <------------ > > /sbin/tc qdisc del dev eth0 root > > /sbin/tc qdisc del dev eth1 root > > /sbin/iptables -t mangle -F qos > > /sbin/iptables -t mangle -Z qos > > /sbin/tc qdisc add dev eth0 root handle 1: htb default 5 r2q 1 > > /sbin/tc class add dev eth0 parent 1: classid 1:1 htb rate 96kbit > > /sbin/tc class add dev eth0 parent 1:1 classid 1:2 htb rate 96kbit ceil 96kbit > > /sbin/tc class add dev eth0 parent 1:2 classid 1:3 htb rate 48kbit ceil 96kbit prio 1 > > /sbin/tc qdisc add dev eth0 handle 3: parent 1:3 sfq perturb 10 limit 31 > > /sbin/tc class add dev eth0 parent 1:2 classid 1:4 htb rate 24kbit ceil 96kbit prio 1 > > /sbin/tc qdisc add dev eth0 handle 4: parent 1:4 sfq perturb 10 limit 31 > > /sbin/tc class add dev eth0 parent 1:2 classid 1:5 htb rate 16kbit ceil 96kbit prio 2 > > /sbin/tc qdisc add dev eth0 handle 5: parent 1:5 sfq perturb 10 limit 31 > > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.85.0/28 -j CLASSIFY --set-class 1:3 > > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.85.80/28 -j CLASSIFY --set-class 1:4 > > /sbin/iptables -t mangle -A qos -o eth0 -s 192.116.106.192/29 -j CLASSIFY --set-class 1:0 > > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.28.48/29 -j CLASSIFY --set-class 1:0 > > /sbin/tc qdisc add dev eth1 root handle 1: htb default 5 r2q 2 > > /sbin/tc class add dev eth1 parent 1: classid 1:1 htb rate 512kbit > > /sbin/tc class add dev eth1 parent 1:1 classid 1:2 htb rate 256kbit ceil 512kbit > > /sbin/tc class add dev eth1 parent 1:2 classid 1:3 htb rate 128kbit ceil 512kbit prio 1 > > /sbin/tc qdisc add dev eth1 handle 3: parent 1:3 sfq perturb 10 limit 169 > > /sbin/tc class add dev eth1 parent 1:2 classid 1:4 htb rate 64kbit ceil 512kbit prio 1 > > /sbin/tc qdisc add dev eth1 handle 4: parent 1:4 sfq perturb 10 limit 169 > > /sbin/tc 
class add dev eth1 parent 1:2 classid 1:5 htb rate 32kbit ceil 512kbit prio 2 > > /sbin/tc qdisc add dev eth1 handle 5: parent 1:5 sfq perturb 10 limit 169 > > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.85.0/28 -j CLASSIFY --set-class 1:3 > > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.85.80/28 -j CLASSIFY --set-class 1:4 > > /sbin/iptables -t mangle -A qos -o eth1 -d 192.116.106.192/29 -j CLASSIFY --set-class 1:0 > > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.28.48/29 -j CLASSIFY --set-class 1:0 > > ------------< snip <------< snip <------< snip <------------ > > > > I've made tons of info available on my home page for you to look at (proc > > files, vmlinux, System.map, original panic message, etc. > > > > http://oasis.frogfoot.net/sfq/ > > > > -- > > > > Regards > > Abraham > > > > I saw what you did and I know who you are. > > > > ___________________________________________________ > > Abraham vd Merwe [ZR1BBQ] - Frogfoot Networks > > P.O. Box 3472, Matieland, Stellenbosch, 7602 > > Cell: +27 82 565 4451 Http: http://www.frogfoot.net/ > > Email: abz@frogfoot.net > > > > - > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > > > > -- > http://www.PowerDNS.com Open source, database driven DNS Software > http://lartc.org Linux Advanced Routing & Traffic Control HOWTO > http://netherlabs.nl Consulting > From jmorris@intercode.com.au Tue Mar 11 04:40:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 04:40:42 -0800 (PST) Received: from blackbird.intercode.com.au (IDENT:root@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BCeaq9023182 for ; Tue, 11 Mar 2003 04:40:38 -0800 Received: from localhost (jmorris@localhost) by blackbird.intercode.com.au (8.11.6/8.9.3) with ESMTP id h2BCe5c31557; 
Tue, 11 Mar 2003 23:40:05 +1100 Date: Tue, 11 Mar 2003 23:40:05 +1100 (EST) From: James Morris To: Alan Cox cc: Ulrik De Bie , , Subject: Re: Fwd: tcp seq nr wrapping bug + patch In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1927 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev Content-Length: 712 Lines: 29 On Mon, 10 Mar 2003, Ulrik De Bie wrote: > I resend this patch which fixes a stupid mistake in the tcp sequence > number in the 2.2 kernel. This looks good, thanks. Alan, please apply. - James -- James Morris diff -urN -X dontdiff linux-2.2.24.orig/net/ipv4/tcp.c linux-2.2.24.w1/net/ipv4/tcp.c --- linux-2.2.24.orig/net/ipv4/tcp.c Wed Sep 25 00:06:26 2002 +++ linux-2.2.24.w1/net/ipv4/tcp.c Tue Mar 11 23:26:00 2003 @@ -823,7 +823,7 @@ */ if (skb_tailroom(skb) > 0 && (mss_now - copy) > 0 && - tp->snd_nxt < TCP_SKB_CB(skb)->end_seq) { + before(tp->snd_nxt, TCP_SKB_CB(skb)->end_seq)) { int last_byte_was_odd = (copy % 4); /* From kuznet@ms2.inr.ac.ru Tue Mar 11 08:09:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 08:09:59 -0800 (PST) Received: from sex.inr.ac.ru (sex.inr.ac.ru [193.233.7.165]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BG95q9004694 for ; Tue, 11 Mar 2003 08:09:49 -0800 Received: (from kuznet@localhost) by sex.inr.ac.ru (8.6.13/ANK) id TAA13074; Tue, 11 Mar 2003 19:08:18 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <200303111608.TAA13074@sex.inr.ac.ru> Subject: Re: kernel panic: bug in sch_sfq.c To: abz@frogfoot.net (Abraham van der Merwe) Date: Tue, 11 Mar 2003 19:08:18 +0300 (MSK) Cc: devik@cdi.cz, ahu@ds9a.nl, linux-kernel@vger.kernel.org, david@uninetwork.co.za, netdev@oss.sgi.com In-Reply-To: <20030311155409.GB7641@oasis.frogfoot.net> from "Abraham van der Merwe" at Mar 11, 3 05:54:09 pm X-Mailer: ELM [version 2.4 PL24] 
MIME-Version: 1.0 X-archive-position: 1928 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Content-Length: 192 Lines: 8 Hello! > Also, if I compile the kernel with all debugging enabled (CONFIG_DEBUG_SLAB, > etc) I can reliably trigger the BUG() on line 1263 in mm/slab.c How does backtrace oops look? Alexey From abz@oasis.frogfoot.net Tue Mar 11 08:46:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 08:46:13 -0800 (PST) Received: from oasis.frogfoot.net (oasis.frogfoot.net [66.8.28.51]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BGjtq9005232 for ; Tue, 11 Mar 2003 08:45:59 -0800 Received: (qmail 7880 invoked by uid 1001); 11 Mar 2003 15:54:09 -0000 Date: Tue, 11 Mar 2003 17:54:09 +0200 From: Abraham van der Merwe To: devik Cc: bert hubert , Alexey Kuznetsov , Linux Kernel Mailinlist , David Jarvis , netdev@oss.sgi.com Subject: Re: kernel panic: bug in sch_sfq.c Message-ID: <20030311155409.GB7641@oasis.frogfoot.net> Mail-Followup-To: devik , bert hubert , Alexey Kuznetsov , Linux Kernel Mailinlist , David Jarvis , netdev@oss.sgi.com References: <20030311094420.GB19658@outpost.ds9a.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i Organization: Frogfoot Networks X-Operating-System: Debian GNU/Linux oasis 2.4.20-rc1 i686 X-GPG-Public-Key: http://oasis.frogfoot.net/pgpkeys/keys/frogfoot.gpg X-Uptime: 17:40:32 up 69 days, 5:07, 6 users, load average: 0.00, 0.03, 0.00 X-Edited-With-Muttmode: muttmail.sl - 2001-09-27 X-archive-position: 1929 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: abz@frogfoot.net Precedence: bulk X-list: netdev Content-Length: 8729 Lines: 185 Hi devik! 
In this case, I added the rules in the morning, but there was no traffic flowing through the machine. That evening we redirected traffic through the box (so HTB obviously kicked in). About 10 minutes later it crashed.

I think the only common thing I've seen so far is that it crashes when HTB is actually doing a lot of shaping/prioritizing, which doesn't help much.

Also, if I compile the kernel with all debugging enabled (CONFIG_DEBUG_SLAB, etc.) I can reliably trigger the BUG() on line 1263 in mm/slab.c. I don't know if this is related to the HTB problem; I've tested this on many different machines with many different kernels. If I disable CONFIG_DEBUG_SLAB, I don't get that (obviously).

This is the first crash since I started disabling CONFIG_DEBUG_SLAB on our Linux QoS boxes.

> Hmm,
> I looked at it. It seems that skb linked list was corrupted
> (containing NULL pointer). It could be because of two problems,
> either someone (maybe htb too)'ve overwritten memory or HTB
> called dequeue with wrong argument.
> Latter is unlikely because I call q->dequeue and sfq's
> dequeue was really called. Thus pointer is ok.
> Let's examine how could HTB mungle with qdisc internals. If
> htb would think that leaf is inner node, inner.feed[0] is
> pointer equal to leaf.q. I examined code but there is no
> way to make this mistake.
>
> Last 3 days I got 3 bugreports. Each crashes in different
> place and all seem unrelated. Each is NULL pointer dereference
> though.
> I think there is some place in some code which writes in
> bad random places in memory :-\
>
> To ask all people whose have seen such Oops, have you used
> dynamic tc classes changes ? Like creating/deleting/changing/viewving
> classes offten at runtime ? (I'm trying to find common trigger).
>
> thanks, devik
> HTB maintainer
>
> On Tue, 11 Mar 2003, bert hubert wrote:
>
> > On Tue, Mar 11, 2003 at 11:14:09AM +0200, Abraham van der Merwe wrote:
> > > Hi!
> > >
> > > I have a box that crashed today.
Below is the decoded kernel panic. If you > > > track down the bug PLEASE send me a patch. > > > > Weird, Alexeys code is normally very very solid. Perhaps HTB is also > > involved. Devik? > > > > > > > > ------------< snip <------< snip <------< snip <------------ > > > ksymoops 2.4.8 on i686 2.4.20-rc1. Options used > > > -v vmlinux-2.4.21-pre5 (specified) > > > -K (specified) > > > -L (specified) > > > -O (specified) > > > -m System.map-2.4.21-pre5 (specified) > > > > > > Unable to handle kernel NULL pointer dereference at virtual address 00000004 > > > *pde = 00000000 > > > Oops: 0002 > > > CPU: 0 > > > EIP: 0010:[] Not tainted > > > Using defaults from ksymoops -t elf32-i386 -a i386 > > > EFLAGS: 00010202 > > > eax: 00000000 ebx: c7b9a9e8 ecx: 0000007f edx: c7a8eef8 > > > esi: c7b9ab08 edi: 000007f0 ebp: c7a8e060 esp: c021deb8 > > > ds: 0018 es: 0018 ss: 0018 > > > Process swapper (pid: 0, stackpage=c021d000) > > > Stack: c7b9a9e8 c7b9ab08 c7f7ee00 c7b9a860 c7b893c0 c7f7ee00 c7b9a860 00000000 > > > c01a3507 c7b5c680 7fb9a9f0 c01a339e c7a8e000 ffffffff 00000018 00000006 > > > c7b9a800 00000018 00000006 c7b9a800 c7b9a9e8 c7b9ab08 c7f7ee00 c01a371c > > > Call Trace: [] [] [] [] [>c019949d>] > > > [] [] [] [] [] [] > > > [] [] [] [] [] > > > Code: 89 50 04 89 02 8b 5c 24 24 c7 03 00 00 00 00 c7 43 04 00 00 > > > > > > > > > >>EIP; c01a5399 <===== > > > > > > >>esp; c021deb8 > > > > > > Trace; c01a3507 > > > Trace; c01a339e > > > Trace; c01a371c > > > Trace; c019f7a3 > > > Trace; c0115a6a > > > Trace; c01082bd > > > Trace; c0105240 > > > Trace; c0105240 > > > Trace; c010a528 > > > Trace; c0105240 > > > Trace; c0105240 > > > Trace; c0105263 > > > Trace; c01052d2 > > > Trace; c0105000 <_stext+0/0> > > > Trace; c0105027 > > > > > > Code; c01a5399 > > > 00000000 <_EIP>: > > > Code; c01a5399 <===== > > > 0: 89 50 04 mov %edx,0x4(%eax) <===== > > > Code; c01a539c > > > 3: 89 02 mov %eax,(%edx) > > > Code; c01a539e > > > 5: 8b 5c 24 24 mov 0x24(%esp,1),%ebx > > > Code; 
c01a53a2 > > > 9: c7 03 00 00 00 00 movl $0x0,(%ebx) > > > Code; c01a53a8 > > > f: c7 43 04 00 00 00 00 movl $0x0,0x4(%ebx) > > > > > > <0>Kernel panic: Aiee, killing interrupt handler! > > > ------------< snip <------< snip <------< snip <------------ > > > > > > Below are the rules that were installed on the system: > > > > > > ------------< snip <------< snip <------< snip <------------ > > > /sbin/tc qdisc del dev eth0 root > > > /sbin/tc qdisc del dev eth1 root > > > /sbin/iptables -t mangle -F qos > > > /sbin/iptables -t mangle -Z qos > > > /sbin/tc qdisc add dev eth0 root handle 1: htb default 5 r2q 1 > > > /sbin/tc class add dev eth0 parent 1: classid 1:1 htb rate 96kbit > > > /sbin/tc class add dev eth0 parent 1:1 classid 1:2 htb rate 96kbit ceil 96kbit > > > /sbin/tc class add dev eth0 parent 1:2 classid 1:3 htb rate 48kbit ceil 96kbit prio 1 > > > /sbin/tc qdisc add dev eth0 handle 3: parent 1:3 sfq perturb 10 limit 31 > > > /sbin/tc class add dev eth0 parent 1:2 classid 1:4 htb rate 24kbit ceil 96kbit prio 1 > > > /sbin/tc qdisc add dev eth0 handle 4: parent 1:4 sfq perturb 10 limit 31 > > > /sbin/tc class add dev eth0 parent 1:2 classid 1:5 htb rate 16kbit ceil 96kbit prio 2 > > > /sbin/tc qdisc add dev eth0 handle 5: parent 1:5 sfq perturb 10 limit 31 > > > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.85.0/28 -j CLASSIFY --set-class 1:3 > > > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.85.80/28 -j CLASSIFY --set-class 1:4 > > > /sbin/iptables -t mangle -A qos -o eth0 -s 192.116.106.192/29 -j CLASSIFY --set-class 1:0 > > > /sbin/iptables -t mangle -A qos -o eth0 -s 66.8.28.48/29 -j CLASSIFY --set-class 1:0 > > > /sbin/tc qdisc add dev eth1 root handle 1: htb default 5 r2q 2 > > > /sbin/tc class add dev eth1 parent 1: classid 1:1 htb rate 512kbit > > > /sbin/tc class add dev eth1 parent 1:1 classid 1:2 htb rate 256kbit ceil 512kbit > > > /sbin/tc class add dev eth1 parent 1:2 classid 1:3 htb rate 128kbit ceil 512kbit prio 1 > > > /sbin/tc qdisc 
add dev eth1 handle 3: parent 1:3 sfq perturb 10 limit 169 > > > /sbin/tc class add dev eth1 parent 1:2 classid 1:4 htb rate 64kbit ceil 512kbit prio 1 > > > /sbin/tc qdisc add dev eth1 handle 4: parent 1:4 sfq perturb 10 limit 169 > > > /sbin/tc class add dev eth1 parent 1:2 classid 1:5 htb rate 32kbit ceil 512kbit prio 2 > > > /sbin/tc qdisc add dev eth1 handle 5: parent 1:5 sfq perturb 10 limit 169 > > > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.85.0/28 -j CLASSIFY --set-class 1:3 > > > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.85.80/28 -j CLASSIFY --set-class 1:4 > > > /sbin/iptables -t mangle -A qos -o eth1 -d 192.116.106.192/29 -j CLASSIFY --set-class 1:0 > > > /sbin/iptables -t mangle -A qos -o eth1 -d 66.8.28.48/29 -j CLASSIFY --set-class 1:0 > > > ------------< snip <------< snip <------< snip <------------ > > > > > > I've made tons of info available on my home page for you to look at (proc > > > files, vmlinux, System.map, original panic message, etc. > > > > > > http://oasis.frogfoot.net/sfq/ > > > > -- > > http://www.PowerDNS.com Open source, database driven DNS Software > > http://lartc.org Linux Advanced Routing & Traffic Control HOWTO > > http://netherlabs.nl Consulting > > > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Regards Abraham Reporter (to Mahatma Gandhi): Mr Gandhi, what do you think of Western Civilization? Gandhi: I think it would be a good idea. 
___________________________________________________ Abraham vd Merwe - Frogfoot Networks CC 9 Kinnaird Court, 33 Main Street, Newlands, 7700 Phone: +27 21 686 1674 Cell: +27 82 565 4451 Http: http://www.frogfoot.net/ Email: abz@frogfoot.net From rddunlap@osdl.org Tue Mar 11 11:58:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 11:58:33 -0800 (PST) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BJwSq9007519 for ; Tue, 11 Mar 2003 11:58:29 -0800 Received: from dragon.pdx.osdl.net (dragon.pdx.osdl.net [172.20.1.27]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h2BJwQG28469; Tue, 11 Mar 2003 11:58:26 -0800 Date: Tue, 11 Mar 2003 11:56:12 -0800 From: "Randy.Dunlap" To: linux-net@vger.kernel.org Cc: netdev@oss.sgi.com Subject: updating MIBs/statistics Message-Id: <20030311115612.73821921.rddunlap@osdl.org> Organization: OSDL X-Mailer: Sylpheed version 0.8.6 (GTK+ 1.2.10; i586-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 1930 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rddunlap@osdl.org Precedence: bulk X-list: netdev Content-Length: 412 Lines: 16 Hi, I'm looking into doing some MIB updates, such as making the IPv6 stats table be on a per interface basis (per RFC 2465) and adding a UDP listener table and TCP connection table as well as some other IPv6 MIB requirements. Has anyone else tackled this? Assuming that the patches for this are in Linux style, are there any issues with doing updates like these? Like non-technical issues? 
Thanks, -- ~Randy From abz@oasis.frogfoot.net Tue Mar 11 13:20:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 13:20:39 -0800 (PST) Received: from oasis.frogfoot.net (oasis.frogfoot.net [66.8.28.51]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BLKNq9011281 for ; Tue, 11 Mar 2003 13:20:27 -0800 Received: (qmail 8925 invoked by uid 1001); 11 Mar 2003 21:19:02 -0000 Date: Tue, 11 Mar 2003 23:19:02 +0200 From: Abraham van der Merwe To: kuznet@ms2.inr.ac.ru Cc: devik@cdi.cz, ahu@ds9a.nl, linux-kernel@vger.kernel.org, david@uninetwork.co.za, netdev@oss.sgi.com Subject: Re: kernel panic: bug in sch_sfq.c Message-ID: <20030311211902.GA8699@oasis.frogfoot.net> Mail-Followup-To: kuznet@ms2.inr.ac.ru, devik@cdi.cz, ahu@ds9a.nl, linux-kernel@vger.kernel.org, david@uninetwork.co.za, netdev@oss.sgi.com References: <20030311155409.GB7641@oasis.frogfoot.net> <200303111608.TAA13074@sex.inr.ac.ru> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="BXVAT5kNtrzKuDFl" Content-Disposition: inline In-Reply-To: <200303111608.TAA13074@sex.inr.ac.ru> User-Agent: Mutt/1.3.28i Organization: Frogfoot Networks X-Operating-System: Debian GNU/Linux oasis 2.4.20-rc1 i686 X-GPG-Public-Key: http://oasis.frogfoot.net/pgpkeys/keys/frogfoot.gpg X-Uptime: 22:58:38 up 69 days, 10:25, 8 users, load average: 0.02, 0.01, 0.00 X-Edited-With-Muttmode: muttmail.sl - 2001-09-27 X-archive-position: 1931 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: abz@frogfoot.net Precedence: bulk X-list: netdev Content-Length: 2439 Lines: 101 --BXVAT5kNtrzKuDFl Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi kuznet! 
> > Also, if I compile the kernel with all debugging enabled (CONFIG_DEBUG_SLAB,
> > etc) I can reliably trigger the BUG() on line 1263 in mm/slab.c
>
> How does backtrace oops look?

I didn't write down most of the BUG() panics, but here is one (unfortunately it doesn't have any QoS code in the stack trace):

------------< snip <------< snip <------< snip <------------
root@trillian:~/uni-qos# cat panic.txt
c0192eb1 c0176c8c c017718c c0176afa c010810f c01082b3 c0105240 c0105240 c0105240 c0105240 c0105263 c01052d2 c0105000 c0105027
0f 0b ef 04 e0 87 1e c0 f7 c5 00 04 00 00 74 36 b8 a5 c2 0f
EIP: 0010:c012642e ESP: c0221eb4
KERNEL BUG slab.c:1263
root@trillian:~/uni-qos#
------------< snip <------< snip <------< snip <------------

A quick objdump through the kernel's vmlinux image reveals, that the stack trace above looks as follows:

------------< snip <------< snip <------< snip <------------
c0192eb1 alloc_skb
c0176c8c speedo_refill_rx_buf
c017718c speedo_rx
c0176afa speedo_interrupt
c010810f handle_IRQ_event
c01082b3 do_IRQ
c0105240 default_idle
c0105240 default_idle
c0105240 default_idle
c0105240
c0105263 default_idle
c01052d2 cpu_idle
c0105000 rest_init
c0105027 rest_init
------------< snip <------< snip <------< snip <------------

It crashes when it hits BUG(); in slab.c:

------------< snip <------< snip <------< snip <------------
#if DEBUG
	if (cachep->flags & SLAB_POISON)
		if (kmem_check_poison_obj(cachep, objp))
			BUG();
------------< snip <------< snip <------< snip <------------

-- 
Regards
 Abraham

Nothing is so often irretrievably missed as a daily opportunity.
-- Ebner-Eschenbach ___________________________________________________ Abraham vd Merwe - Frogfoot Networks CC 9 Kinnaird Court, 33 Main Street, Newlands, 7700 Phone: +27 21 686 1674 Cell: +27 82 565 4451 Http: http://www.frogfoot.net/ Email: abz@frogfoot.net --BXVAT5kNtrzKuDFl Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.5 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE+blLG0jJV70h31dERAiQ7AJ0b2vqj6ouXT7nlHnSGB2Y8JfLqawCfeKW7 8CmtViJv+1OGwWaCRYR/M+k= =l040 -----END PGP SIGNATURE----- --BXVAT5kNtrzKuDFl-- From rreddy@c.psc.edu Tue Mar 11 15:11:16 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 15:11:20 -0800 (PST) Received: from c.psc.edu (c.psc.edu [128.182.73.106]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2BNAZq9015153 for ; Tue, 11 Mar 2003 15:11:16 -0800 Received: by c.psc.edu for NETDEV@OSS.SGI.COM; Tue, 11 Mar 2003 18:10:35 -0500 Date: Tue, 11 Mar 2003 18:10:35 -0500 From: "Raghurama 'REDDY'" Reply-To: rreddy@psc.edu To: NETDEV@OSS.SGI.COM CC: RREDDY@vms.psc.edu Message-Id: <03031118103534.2221577a.8921380@psc.edu> Subject: Bug or feature: raw sockets ignores IP_DF when packet is bigger than pmtu X-archive-position: 1932 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rreddy@psc.edu Precedence: bulk X-list: netdev Content-Length: 1611 Lines: 52 Please let me know if this is the expected behavior: I am running a linux 2.4.20 system. The socket type is SOCK_RAW and the protocol is IPPROTO_RAW I am filling IP header myself. IP_DF is set in the header. I have experimented with and without IP_HDRINCL, and it did not make difference. It appears that on Linux, if the protocol and the socket are RAW, it does assume the header is included. 
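For anyone trying to reproduce this, a small sketch of where the DF bit sits in a hand-built IPv4 header, and how to check it in captured bytes, may help. The demo_* names are made up for this example. It only builds and inspects a 20-byte header in memory and does not open a socket, so it says nothing about what the kernel does with the packet afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "Don't fragment" flag in the 16-bit flags/fragment-offset field. */
#define DEMO_IP_DF 0x4000u

/*
 * Hypothetical helper: write a 20-byte IPv4 header (no options) with DF
 * set. Multi-byte fields are stored in network byte order. The header
 * checksum (bytes 10-11) is left at zero in this sketch.
 */
static void demo_fill_ip_header(uint8_t hdr[20], uint16_t total_len,
                                uint8_t protocol,
                                const uint8_t src[4], const uint8_t dst[4])
{
    assert(total_len >= 20);
    memset(hdr, 0, 20);
    hdr[0] = 0x45;                          /* version 4, header length 5 words */
    hdr[2] = (uint8_t)(total_len >> 8);     /* total length, high byte */
    hdr[3] = (uint8_t)(total_len & 0xff);   /* total length, low byte */
    hdr[6] = (uint8_t)(DEMO_IP_DF >> 8);    /* flags: DF only, offset 0 */
    hdr[7] = (uint8_t)(DEMO_IP_DF & 0xff);
    hdr[8] = 64;                            /* TTL */
    hdr[9] = protocol;
    memcpy(&hdr[12], src, 4);               /* source address */
    memcpy(&hdr[16], dst, 4);               /* destination address */
}

/* Returns non-zero if the DF flag is set in the header. */
static int demo_df_is_set(const uint8_t hdr[20])
{
    uint16_t flags_off = (uint16_t)((hdr[6] << 8) | hdr[7]);
    return (flags_off & DEMO_IP_DF) != 0;
}

/*
 * Returns non-zero if the header belongs to a fragment: either the
 * "more fragments" flag (0x2000) is set or the 13-bit offset is non-zero.
 */
static int demo_is_fragment(const uint8_t hdr[20])
{
    uint16_t flags_off = (uint16_t)((hdr[6] << 8) | hdr[7]);
    return (flags_off & 0x3fffu) != 0;
}
```

The two check functions can be pointed at the first 20 bytes of each captured packet to tell a DF-marked, unfragmented packet apart from fragments sent without DF.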
For the test:

  The interface MTU is 4470
  Next hop is a router with an MTU of 1500
  Packet size being sent out is about 2000 bytes

What is observed on running a program like traceroute with the "-M" option is:

The first time I run it, I do get a "Fragmentation required" message as
expected. We can also observe in the tcpdump output that DF is set.

If I run the test again immediately, we can see in the tcpdump output on
the outgoing interface that IP fragments the message to 1500 bytes and
sends the fragments out without setting the DF bit. This is because the
route cache has the path MTU stored as 1500.

If I wait for some time (until the cache expires), or explicitly flush the
cache with:

  echo 1 > /proc/sys/net/ipv4/route/flush

and rerun the test, it works as expected and returns "Fragmentation required"
packets.

So the conjecture is that IP on the "host" fragments the packets if it
knows the path MTU is not large enough to send the packet without
fragmentation (even when the DF bit is set).

Apparently this is consistent with the IPv6 spec, which says that routers
cannot fragment packets and that hosts may.

Thanks!
--rr
From seong@etri.re.kr Tue Mar 11 17:01:05 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 17:01:11 -0800 (PST) Received: from cms1.etri.re.kr (cms1.etri.re.kr [129.254.16.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2C113q9020564 for ; Tue, 11 Mar 2003 17:01:05 -0800 Received: from SEONG ([129.254.172.40]) by cms1.etri.re.kr with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GRNPMBYW; Wed, 12 Mar 2003 10:00:43 +0900 Message-ID: <001801c2e833$0d44e430$28acfe81@seong> From: "Seong Moon" To: Subject: arp cache deletion and netlink ?
Date: Wed, 12 Mar 2003 10:02:28 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4920.2300 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4920.2300 X-archive-position: 1933 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: seong@etri.re.kr Precedence: bulk X-list: netdev Content-Length: 171 Lines: 8 Hi there. Can I monitor the deletion of an arp cache entry through netlink interface ? I looked into kernel source, then I found there is no implementation about that. From davem@redhat.com Tue Mar 11 23:40:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 11 Mar 2003 23:40:15 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2C7dVq9012071 for ; Tue, 11 Mar 2003 23:40:12 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id XAA06207; Tue, 11 Mar 2003 23:38:43 -0800 Date: Tue, 11 Mar 2003 23:38:43 -0800 (PST) Message-Id: <20030311.233843.97559124.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, usagi@linux-ipv6.org Subject: Re: [PATCH] IPSEC: typo in xfrm_sk_clone_policy() From: "David S. Miller" In-Reply-To: <20030312.161749.123173528.yoshfuji@linux-ipv6.org> References: <20030312.161749.123173528.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-archive-position: 1934 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 316 Lines: 9 From: YOSHIFUJI Hideaki / $B5HF#1QL@(B Date: Wed, 12 Mar 2003 16:17:49 +0900 (JST) I think following patch fixes a typo in xfrm_sk_clone_policy() which results in infinite loop if sk->policy[0] or sk->policy[1] is true. Patch is for 2.5.64. Patch applied, thank you. From yoshfuji@linux-ipv6.org Wed Mar 12 00:11:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 12 Mar 2003 00:11:13 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2C8B6q9012767 for ; Wed, 12 Mar 2003 00:11:08 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2C7HnUl019287; Wed, 12 Mar 2003 16:17:50 +0900 Date: Wed, 12 Mar 2003 16:17:49 +0900 (JST) Message-Id: <20030312.161749.123173528.yoshfuji@linux-ipv6.org> To: davem@redhat.com, kuznet@ms2.inr.ac.ru CC: netdev@oss.sgi.com, usagi@linux-ipv6.org Subject: [PATCH] IPSEC: typo in xfrm_sk_clone_policy() From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1935 X-ecartis-version: Ecartis 
v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 844 Lines: 29

Hello.

I think the following patch fixes a typo in xfrm_sk_clone_policy()
which results in an infinite loop if sk->policy[0] or sk->policy[1] is true.

Patch is for 2.5.64.

Thanks.

Index: include/net/xfrm.h
===================================================================
RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/xfrm.h,v
retrieving revision 1.1.1.7
diff -u -r1.1.1.7 xfrm.h
--- include/net/xfrm.h	16 Feb 2003 04:09:06 -0000	1.1.1.7
+++ include/net/xfrm.h	12 Mar 2003 07:06:20 -0000
@@ -335,7 +335,7 @@
 static inline int xfrm_sk_clone_policy(struct sock *sk)
 {
 	if (unlikely(sk->policy[0] || sk->policy[1]))
-		return xfrm_sk_clone_policy(sk);
+		return __xfrm_sk_clone_policy(sk);
 	return 0;
 }
 

--
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA
From maxiu@man.poznan.pl Wed Mar 12 04:28:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 12 Mar 2003 04:28:16 -0800 (PST) Received: from rose.man.poznan.pl (rose.man.poznan.pl [150.254.173.3]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2CCS6q9030640 for ; Wed, 12 Mar 2003 04:28:08 -0800 Received: from rose.man.poznan.pl (localhost [127.0.0.1]) by rose.man.poznan.pl (8.12.5/8.12.5/auth/ldap/milter/tls) with ESMTP id h2CCHlUM018534 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Wed, 12 Mar 2003 13:17:48 +0100 (CET) Received: from localhost (maxiu@localhost) by rose.man.poznan.pl (8.12.5/8.12.5/Submit) with ESMTP id h2CCHl5O018530 for ; Wed, 12 Mar 2003 13:17:47 +0100 (CET) X-Authentication-Warning: rose.man.poznan.pl: maxiu owned process doing -bs Date: Wed, 12 Mar 2003 13:17:47 +0100 (CET) From: Marcin Kaminski To: netdev@oss.sgi.com Subject: socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-2
X-RAVMilter-Version: 8.4.1(snapshot 20020919) (rose) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by oss.sgi.com id h2CCS6q9030640 X-archive-position: 1936 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: maxiu@man.poznan.pl Precedence: bulk X-list: netdev Content-Length: 777 Lines: 24 Hi I've found, that on my system (2.2.22) RAW sockets for IPv6 works different than for IPv4. When I create socket like: interfaceSocket = socket(PF_INET, SOCK_RAW, IPPROTO_ICMP); I receive ICMPv4 packets with IP header, but when I use interfaceSocket = socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6); I receive ICMPv6 packets WITHOUT IPv6 header. What should I do in order to get full packet? Man pages of raw(7) tell: For receiving the IP header is always included in the packet. But it is not true for IPv6 :( With regards - Marcin Kaminski --------------------------------- maxiu - --- software developer ------------------- 6net project --- ----- network administrator -------- Best Group admin ----- ------- Poznañ Supercomputing and Networking Center ------- From yoshfuji@linux-ipv6.org Wed Mar 12 04:39:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 12 Mar 2003 04:39:46 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2CCd2q9031045 for ; Wed, 12 Mar 2003 04:39:42 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2CCd8Ul021264; Wed, 12 Mar 2003 21:39:08 +0900 Date: Wed, 12 Mar 2003 21:39:03 +0900 (JST) Message-Id: <20030312.213903.04850857.yoshfuji@linux-ipv6.org> To: maxiu@man.poznan.pl Cc: netdev@oss.sgi.com Subject: Re: socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6) From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: Organization: USAGI Project X-URL: 
http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1937 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 1006 Lines: 29 In article (at Wed, 12 Mar 2003 13:17:47 +0100 (CET)), Marcin Kaminski says: > When I create socket like: > interfaceSocket = socket(PF_INET, SOCK_RAW, IPPROTO_ICMP); > I receive ICMPv4 packets with IP header, but when I use > interfaceSocket = socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6); > I receive ICMPv6 packets WITHOUT IPv6 header. It is because of the specification (RFC2292). > What should I do in order to get full packet? > Man pages of raw(7) tell: There're no portable way to send/receive whole packet including IPv6 header (and possible extension header(s)). > What should I do in order to get full packet? > Man pages of raw(7) tell: > > For receiving the IP header is always included in the packet. > > But it is not true for IPv6 :( It is an error of that manpage. 
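Given the difference described above, a receiver that wants one ICMP parsing path for both families has to step over the IPv4 header itself on the PF_INET socket, since only that socket delivers it. A minimal sketch in C, assuming the buffer came straight from recv() on a raw PF_INET socket; icmp4_offset() is a hypothetical name for illustration, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the byte offset of the ICMP message inside
 * a packet read from socket(PF_INET, SOCK_RAW, IPPROTO_ICMP), or -1 if
 * the buffer cannot hold a valid IPv4 header.
 *
 * The low nibble of the first byte is the header length (IHL) in 32-bit
 * words, so IP options are skipped as well.
 */
long icmp4_offset(const uint8_t *pkt, size_t len)
{
    size_t ihl;

    assert(pkt != NULL);
    if (len < 20)                       /* shorter than a minimal header */
        return -1;
    if ((pkt[0] >> 4) != 4)             /* not IPv4 */
        return -1;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;  /* header length in bytes */
    if (ihl < 20 || ihl > len)          /* malformed or truncated */
        return -1;
    return (long)ihl;
}
```

For the PF_INET6 socket the offset is simply 0, so the same ICMP routine can be called with `buf + offset` in both cases.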
--
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA
From maxiu@man.poznan.pl Wed Mar 12 04:52:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 12 Mar 2003 04:52:46 -0800 (PST) Received: from rose.man.poznan.pl (rose.man.poznan.pl [150.254.173.3]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2CCq3q9032104 for ; Wed, 12 Mar 2003 04:52:44 -0800 Received: from rose.man.poznan.pl (localhost [127.0.0.1]) by rose.man.poznan.pl (8.12.5/8.12.5/auth/ldap/milter/tls) with ESMTP id h2CCprUM023964 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 12 Mar 2003 13:51:57 +0100 (CET) Received: from localhost (maxiu@localhost) by rose.man.poznan.pl (8.12.5/8.12.5/Submit) with ESMTP id h2CCprbK023960; Wed, 12 Mar 2003 13:51:53 +0100 (CET) X-Authentication-Warning: rose.man.poznan.pl: maxiu owned process doing -bs Date: Wed, 12 Mar 2003 13:51:53 +0100 (CET) From: Marcin Kaminski To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: netdev@oss.sgi.com Subject: Re: socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6) In-Reply-To: <20030312.213903.04850857.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-2 X-RAVMilter-Version: 8.4.1(snapshot 20020919) (rose) X-archive-position: 1938 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: maxiu@man.poznan.pl Precedence: bulk X-list: netdev Content-Length: 444 Lines: 13

On Wed, 12 Mar 2003, YOSHIFUJI Hideaki / 吉藤英明 wrote:

> > What should I do in order to get full packet?
>
> There're no portable way to send/receive whole packet including
> IPv6 header (and possible extension header(s)).

OK.
Is there a simpler method of obtaining the source and destination addresses of
an ICMP packet (the only information I need from the IPv6 header) than setting
IPV6_PKTINFO and receiving them with recvmsg?
From yoshfuji@linux-ipv6.org Wed Mar 12 07:33:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 12 Mar 2003 07:33:37 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2CFXWq9029353 for ; Wed, 12 Mar 2003 07:33:33 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2CFXfUl022095; Thu, 13 Mar 2003 00:33:42 +0900 Date: Thu, 13 Mar 2003 00:33:41 +0900 (JST) Message-Id: <20030313.003341.49620358.yoshfuji@linux-ipv6.org> To: maxiu@man.poznan.pl Cc: netdev@oss.sgi.com Subject: Re: socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6) From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: <20030312.213903.04850857.yoshfuji@linux-ipv6.org> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 1939 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 484 Lines: 10 In article (at Wed, 12 Mar 2003 13:51:53 +0100 (CET)), Marcin Kaminski says: > Is there simplier method of obtaining source and destaination address of > icmp packet (the only informations I need from ipv6 header) than setting > IPV6_PKTINFO and receiving them with recvmsg? No; recvmsg() and IPV6_PKTINFO socket options is the SIMPLE way for obtaining source and destination address. 
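For readers following along, the recvmsg()/IPV6_PKTINFO approach can be sketched roughly as below. This is a minimal sketch under stated assumptions, not code from the thread: enable_pktinfo(), find_pktinfo() and recv_with_addrs() are hypothetical names, and it assumes glibc, where struct in6_pktinfo needs _GNU_SOURCE. On systems following the newer advanced API the option that turns delivery on is IPV6_RECVPKTINFO; older ones set IPV6_PKTINFO directly, so the sketch picks whichever the headers offer. Note that the ancillary data carries the destination address and the interface index, while the source address comes back through msg_name.

```c
#define _GNU_SOURCE             /* glibc: expose struct in6_pktinfo */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/* Hypothetical helper: ask the kernel to attach pktinfo to received packets. */
int enable_pktinfo(int fd)
{
    int on = 1;
#ifdef IPV6_RECVPKTINFO
    return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPKTINFO, &on, sizeof(on));
#else
    return setsockopt(fd, IPPROTO_IPV6, IPV6_PKTINFO, &on, sizeof(on));
#endif
}

/*
 * Hypothetical helper: scan the ancillary data of a received message for
 * an IPV6_PKTINFO item and copy it to *out. Returns 1 if found, else 0.
 */
int find_pktinfo(struct msghdr *msg, struct in6_pktinfo *out)
{
    struct cmsghdr *cm;

    assert(msg != NULL && out != NULL);
    for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IPV6 && cm->cmsg_type == IPV6_PKTINFO) {
            memcpy(out, CMSG_DATA(cm), sizeof(*out));
            return 1;
        }
    }
    return 0;
}

/*
 * Hypothetical wrapper: receive one packet, returning the source address
 * in *src (from msg_name) and the destination address plus interface
 * index in *dst (zeroed if no pktinfo item was delivered).
 */
ssize_t recv_with_addrs(int fd, void *buf, size_t len,
                        struct sockaddr_in6 *src, struct in6_pktinfo *dst)
{
    union {                     /* union keeps the buffer cmsghdr-aligned */
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(struct in6_pktinfo))];
    } ctl;
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = src;
    msg.msg_namelen = sizeof(*src);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    n = recvmsg(fd, &msg, 0);
    if (n >= 0 && !find_pktinfo(&msg, dst))
        memset(dst, 0, sizeof(*dst));
    return n;
}
```

A caller would run enable_pktinfo() once after socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6) and then loop on recv_with_addrs().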
--yoshfuji From ps41@hotmail.com Wed Mar 12 15:04:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 12 Mar 2003 15:04:20 -0800 (PST) Received: from hotmail.com (bay1-f208.bay1.hotmail.com [65.54.245.208]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2CN4Gq9004839 for ; Wed, 12 Mar 2003 15:04:16 -0800 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Wed, 12 Mar 2003 15:04:11 -0800 Received: from 136.162.88.194 by by1fd.bay1.hotmail.msn.com with HTTP; Wed, 12 Mar 2003 23:04:10 GMT X-Originating-IP: [136.162.88.194] From: "Parag Sharma" To: netdev@oss.sgi.com Subject: faulting in user pages for zero copy transmit Date: Wed, 12 Mar 2003 14:04:10 -0900 Mime-Version: 1.0 Content-Type: text/plain; format=flowed Message-ID: X-OriginalArrivalTime: 12 Mar 2003 23:04:11.0102 (UTC) FILETIME=[B16637E0:01C2E8EB] X-archive-position: 1940 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ps41@hotmail.com Precedence: bulk X-list: netdev Content-Length: 473 Lines: 22 Hi, I am looking at 2.4.19-ac4 source code and trying to understand the zero copy transmit. I have been trying to figure out how/where are the user pages, that might have been swapped out, brought into memory prior to DMA? I would appreciate any help in figuring this one out. 
thanks Parag From maxiu@man.poznan.pl Thu Mar 13 02:41:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 13 Mar 2003 02:41:21 -0800 (PST) Received: from rose.man.poznan.pl (rose.man.poznan.pl [150.254.173.3]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2DAf7q9023691 for ; Thu, 13 Mar 2003 02:41:09 -0800 Received: from rose.man.poznan.pl (localhost [127.0.0.1]) by rose.man.poznan.pl (8.12.5/8.12.5/auth/ldap/milter/tls) with ESMTP id h2DAeuUM021469 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 13 Mar 2003 11:40:57 +0100 (CET) Received: from localhost (maxiu@localhost) by rose.man.poznan.pl (8.12.5/8.12.5/Submit) with ESMTP id h2DAetTY021466; Thu, 13 Mar 2003 11:40:56 +0100 (CET) X-Authentication-Warning: rose.man.poznan.pl: maxiu owned process doing -bs Date: Thu, 13 Mar 2003 11:40:55 +0100 (CET) From: Marcin Kaminski To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= cc: netdev@oss.sgi.com Subject: Re: socket(PF_INET6, SOCK_RAW, IPPROTO_ICMPV6) In-Reply-To: <20030313.003341.49620358.yoshfuji@linux-ipv6.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-2 X-RAVMilter-Version: 8.4.1(snapshot 20020919) (rose) X-archive-position: 1943 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: maxiu@man.poznan.pl Precedence: bulk X-list: netdev Content-Length: 728 Lines: 17 On Thu, 13 Mar 2003, YOSHIFUJI Hideaki / [iso-2022-jp] $B5HF#1QL@(B wrote: > No; recvmsg() and IPV6_PKTINFO socket options is the SIMPLE way for > obtaining source and destination address. OK, it works very well, but is there a way which is common to IPv4 and IPv6 to get ICMP packets?
Obtaining addresses is common to both protocols, but now I get IP header +
ICMP header from IPv4 sockets, and only the ICMP header from IPv6 sockets, so I
must process them differently (basic ICMP packets have the same structure,
only different values, so routines for ICMP could be universal).
You wrote that there is no portable way to obtain IPv6 + ICMPv6, so is there
a way to portably obtain only ICMPv4 (without the IPv4 header)?

With regards
From anton@samba.org Thu Mar 13 11:25:50 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 13 Mar 2003 11:26:00 -0800 (PST) Received: from lists.samba.org (dp.samba.org [66.70.73.150]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2DJP9q9003141 for ; Thu, 13 Mar 2003 11:25:50 -0800 Received: by lists.samba.org (Postfix, from userid 504) id 26CED2C053; Thu, 13 Mar 2003 19:25:09 +0000 (GMT) Date: Fri, 14 Mar 2003 06:24:57 +1100 From: Anton Blanchard To: netdev@oss.sgi.com Cc: davem@redhat.com, akpm@digeo.com, bcrl@redhat.com Subject: recvmsg compat code Message-ID: <20030313192457.GA3279@krispykreme> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.3i X-archive-position: 1945 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev Content-Length: 1522 Lines: 55

Hi,

The recent clean up of the duplicated recvmsg code (which I was happy to
see go in) broke my sshd. It turns out compat handling of fd passing is
broken. We were looking for the MSG_CMSG_COMPAT flag in msg->msg_flags.

I was just about to pass down flags into the two problem functions;
however, put_cmsg is called from a bunch of places. Any thoughts?
Anton

===== net/core/scm.c 1.6 vs edited =====
--- 1.6/net/core/scm.c	Fri Mar 7 06:06:44 2003
+++ edited/net/core/scm.c	Fri Mar 14 06:18:03 2003
@@ -165,14 +165,15 @@
 	return err;
 }
 
-int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data)
+int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data,
+	     unsigned int flags)
 {
 	struct cmsghdr *cm = (struct cmsghdr*)msg->msg_control;
 	struct cmsghdr cmhdr;
 	int cmlen = CMSG_LEN(len);
 	int err;
 
-	if (MSG_CMSG_COMPAT & msg->msg_flags)
+	if (MSG_CMSG_COMPAT & flags)
 		return put_cmsg_compat(msg, level, type, len, data);
 
 	if (cm==NULL || msg->msg_controllen < sizeof(*cm)) {
@@ -200,7 +201,8 @@
 	return err;
 }
 
-void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
+void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm,
+		    unsigned long flags)
 {
 	struct cmsghdr *cm = (struct cmsghdr*)msg->msg_control;
 
@@ -210,7 +212,7 @@
 	int *cmfptr;
 	int err = 0, i;
 
-	if (MSG_CMSG_COMPAT & msg->msg_flags)
+	if (MSG_CMSG_COMPAT & flags)
 		return scm_detach_fds_compat(msg, scm);
 
 	if (msg->msg_controllen > sizeof(struct cmsghdr))
From mcr@sandelman.ottawa.on.ca Sun Mar 16 20:25:58 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 16 Mar 2003 20:26:13 -0800 (PST) Received: from noxmail.sandelman.ottawa.on.ca ([192.139.46.78]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2H4Ppq9001931 for ; Sun, 16 Mar 2003 20:25:58 -0800 Received: from sandelman.ottawa.on.ca (wl-129-246.wireless.ietf56.ietf.org [130.129.129.246]) by noxmail.sandelman.ottawa.on.ca (8.11.6/8.11.6) with ESMTP id h2H4PRD04384 (using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified OK) for ; Sun, 16 Mar 2003 23:25:29 -0500 (EST) Received: from marajade.sandelman.ottawa.on.ca (marajade [127.0.0.1] (may be forged)) by sandelman.ottawa.on.ca (8.12.3/8.12.3/Debian -4) with ESMTP id h2H4PPWm006365 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Sun, 16 Mar 2003 20:25:26 -0800 Received: from
marajade.sandelman.ottawa.on.ca (mcr@localhost) by marajade.sandelman.ottawa.on.ca (8.12.3/8.12.3/Debian-5) with ESMTP id h2H4PO8h006362 for ; Sun, 16 Mar 2003 20:25:25 -0800 Message-Id: <200303170425.h2H4PO8h006362@marajade.sandelman.ottawa.on.ca> To: netdev@oss.sgi.com Subject: BUG: 2.4.21-pre5 changes network scan order Mime-Version: 1.0 (generated by tm-edit 1.8) Content-Type: text/plain; charset=US-ASCII Date: Sun, 16 Mar 2003 20:25:24 -0800 From: Michael Richardson X-archive-position: 1953 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcr@sandelman.ottawa.on.ca Precedence: bulk X-list: netdev Content-Length: 2928 Lines: 67 -----BEGIN PGP SIGNED MESSAGE----- We installed 2.4.21-pre5 on a system that had 2.4.20 on it before. A) The network scan order changes. It used to be: eth0 = Intel (on motherboard) eth1 = DP8381x eth2 = DS21140 which WAS the PCI BIOS order. 2.4.21-pre5 does NOT get it in the PCI bios order. B) Since we all agree that it is unacceptable to make a change like this in the production stream, I expect that this is a bug. dhs-[~] root 31 #lspci 00:00.0 Host bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 0) 00:01.0 PCI bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03) 00:07.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02) 00:07.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01) 00:07.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01) 00:07.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02) 00:0d.0 Ethernet controller: Intel Corp. 82559ER (rev 09) 00:11.0 Ethernet controller: National Semiconductor Corporation DP83815 (MacPhyr 00:13.0 Ethernet controller: Digital Equipment Corporation DECchip 21140 [Faste) 01:00.0 VGA compatible controller: S3 Inc. 86c368 [Trio 3D/2X] (rev 02) tulip0: EEPROM default media type Autosense. tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block. 
tulip0: MII transceiver #0 config 1000 status 7809 advertising 01e1. divert: allocating divert_blk for eth0 eth0: Digital DS21140 Tulip rev 34 at 0xec00, 00:40:05:A3:52:E6, IRQ 10. eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin and others PCI: Found IRQ 9 for device 00:0d.0 divert: allocating divert_blk for eth1 eth1: Intel Corp. 82559ER, 00:60:EF:11:4E:5A, IRQ 9. Receiver lock-up bug exists -- enabling work-around. Board assembly 645520-034, Physical connectors present: RJ45 Primary interface chip DP83840 PHY #1. DP83840 specific setup, setting register 23 to 0422. General self-test: passed. Serial sub-system self-test: passed. Internal registers self-test: passed. ROM checksum self-test: passed (0xdbd8681d). Receiver lock-up workaround activated. natsemi dp8381x driver, version 1.07+LK1.0.17, Sep 27, 2002 originally by Donald Becker http://www.scyld.com/network/natsemi.html 2.4.x kernel port by Jeff Garzik, Tjeerd Mulder PCI: Found IRQ 9 for device 00:11.0 divert: allocating divert_blk for eth2 eth2: NatSemi DP8381[56] at 0xe080f000, 00:a0:cc:a1:fd:84, IRQ 9. 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: Finger me for keys

iQCVAwUBPnVOI4qHRg3pndX9AQEY5QP/Uu0jvlmLSdH94lyn3yln0Psszc3HjTw/
oc3CiuuIFTlZ7brJ/IuBuGjCrgS31+oaOUMq+xgqQDlXL+Pj9Xw3Q8llphjyWTKz
M15s2If2fTAVTYDuKa9WMGBT8HfmSDrE+V/2UulsG1PivUyztID8VVmqeDECctL0
nivCqDooFV4=
=mUBg
-----END PGP SIGNATURE-----
From hadi@cyberus.ca Sun Mar 16 20:43:46 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 16 Mar 2003 20:43:52 -0800 (PST) Received: from mx03.cyberus.ca (mx03.cyberus.ca [216.191.240.24]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2H4hiq9003158 for ; Sun, 16 Mar 2003 20:43:45 -0800 Received: from shell.cyberus.ca ([216.191.240.114]) by mx03.cyberus.ca with esmtp (Exim 4.10) id 18umTP-0004hU-00; Sun, 16 Mar 2003 23:43:43 -0500 Received: from shell.cyberus.ca (localhost.cyberus.ca [127.0.0.1]) by shell.cyberus.ca (8.12.6/8.12.6) with ESMTP id h2H4hCqu009296; Sun, 16 Mar 2003 23:43:12 -0500 (EST) (envelope-from hadi@cyberus.ca) Received: from localhost (hadi@localhost) by shell.cyberus.ca (8.12.6/8.12.6/Submit) with ESMTP id h2H4hBTQ009293; Sun, 16 Mar 2003 23:43:11 -0500 (EST) (envelope-from hadi@cyberus.ca) X-Authentication-Warning: shell.cyberus.ca: hadi owned process doing -bs Date: Sun, 16 Mar 2003 23:43:10 -0500 (EST) From: jamal To: Michael Richardson cc: netdev@oss.sgi.com, Marcelo Tosatti Subject: Re: BUG: 2.4.21-pre5 changes network scan order In-Reply-To: <200303170425.h2H4PO8j006362@marajade.sandelman.ottawa.on.ca> Message-ID: <20030316233446.T9241@shell.cyberus.ca> References: <200303170425.h2H4PO8j006362@marajade.sandelman.ottawa.on.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 1954 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Content-Length: 3533 Lines: 88

This is definitely unacceptable behavior.
Did driverfs sneak into 2.4?
cheers, jamal On Sun, 16 Mar 2003, Michael Richardson wrote: > ------- Blind-Carbon-Copy > > To: netdev@oss.sgi.com > Subject: BUG: 2.4.21-pre5 changes network scan order > Mime-Version: 1.0 (generated by tm-edit 1.8) > Content-Type: text/plain; charset=US-ASCII > Date: Sun, 16 Mar 2003 20:25:24 -0800 > From: Michael Richardson > > - -----BEGIN PGP SIGNED MESSAGE----- > > > We installed 2.4.21-pre5 on a system that had 2.4.20 on it before. > A) The network scan order changes. It used to be: > eth0 = Intel (on motherboard) > eth1 = DP8381x > eth2 = DS21140 > > which WAS the PCI BIOS order. > > 2.4.21-pre5 does NOT get it in the PCI bios order. > > B) Since we all agree that it is unacceptable to make a change like this in > the production stream, I expect that this is a bug. > > > dhs-[~] root 31 #lspci > 00:00.0 Host bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 0) > 00:01.0 PCI bridge: Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03) > 00:07.0 ISA bridge: Intel Corp. 82371AB/EB/MB PIIX4 ISA (rev 02) > 00:07.1 IDE interface: Intel Corp. 82371AB/EB/MB PIIX4 IDE (rev 01) > 00:07.2 USB Controller: Intel Corp. 82371AB/EB/MB PIIX4 USB (rev 01) > 00:07.3 Bridge: Intel Corp. 82371AB/EB/MB PIIX4 ACPI (rev 02) > 00:0d.0 Ethernet controller: Intel Corp. 82559ER (rev 09) > 00:11.0 Ethernet controller: National Semiconductor Corporation DP83815 (MacPhyr > 00:13.0 Ethernet controller: Digital Equipment Corporation DECchip 21140 [Faste) > 01:00.0 VGA compatible controller: S3 Inc. 86c368 [Trio 3D/2X] (rev 02) > > > > tulip0: EEPROM default media type Autosense. > tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block. > tulip0: MII transceiver #0 config 1000 status 7809 advertising 01e1. > divert: allocating divert_blk for eth0 > eth0: Digital DS21140 Tulip rev 34 at 0xec00, 00:40:05:A3:52:E6, IRQ 10. 
> eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html > eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin and others > PCI: Found IRQ 9 for device 00:0d.0 > divert: allocating divert_blk for eth1 > eth1: Intel Corp. 82559ER, 00:60:EF:11:4E:5A, IRQ 9. > Receiver lock-up bug exists -- enabling work-around. > Board assembly 645520-034, Physical connectors present: RJ45 > Primary interface chip DP83840 PHY #1. > DP83840 specific setup, setting register 23 to 0422. > General self-test: passed. > Serial sub-system self-test: passed. > Internal registers self-test: passed. > ROM checksum self-test: passed (0xdbd8681d). > Receiver lock-up workaround activated. > natsemi dp8381x driver, version 1.07+LK1.0.17, Sep 27, 2002 > originally by Donald Becker > http://www.scyld.com/network/natsemi.html > 2.4.x kernel port by Jeff Garzik, Tjeerd Mulder > PCI: Found IRQ 9 for device 00:11.0 > divert: allocating divert_blk for eth2 > eth2: NatSemi DP8381[56] at 0xe080f000, 00:a0:cc:a1:fd:84, IRQ 9. 
> - -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.7 (GNU/Linux)
> Comment: Finger me for keys
>
> iQCVAwUBPnVOI4qHRg3pndX9AQEY5QP/Uu0jvlmLSdH94lyn3yln0Psszc3HjTw/
> oc3CiuuIFTlZ7brJ/IuBuGjCrgS31+oaOUMq+xgqQDlXL+Pj9Xw3Q8llphjyWTKz
> M15s2If2fTAVTYDuKa9WMGBT8HfmSDrE+V/2UulsG1PivUyztID8VVmqeDECctL0
> nivCqDooFV4=
> =mUBg
> - -----END PGP SIGNATURE-----
>
> ------- End of Blind-Carbon-Copy
>
From andwes-8@student.luth.se Mon Mar 17 06:06:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 17 Mar 2003 06:07:03 -0800 (PST) Received: from gepetto.dc.luth.se (gepetto.dc.luth.se [130.240.42.40]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2HE6Dq9025447 for ; Mon, 17 Mar 2003 06:06:56 -0800 Received: from legion (tomten.campus.luth.se [130.240.221.171]) by gepetto.dc.luth.se (8.12.5/8.12.5) with SMTP id h2HE6CEZ002858 for ; Mon, 17 Mar 2003 15:06:12 +0100 (MET) Message-ID: <001101c2ec8e$5da12bf0$abddf082@legion> From: "Andreas Westin" To: Subject: bug ? Date: Mon, 17 Mar 2003 15:06:06 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-archive-position: 1955 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andwes-8@student.luth.se Precedence: bulk X-list: netdev Content-Length: 2767 Lines: 82

Hello,

I've gotten this "crash" now several times with >=2.4.20. It usually happens
within 1-2 days after bootup while downloading a larger amount of data at
around 500k/s. I am using Gentoo Linux on a dual 400 MHz (Celeron) Abit BP6.
I know this could be due to hardware problems, but I thought that I might as
well report it in case it's a real bug. I hope that the info below is all
that you need.
/Andreas Network driver: 8139too Fast Ethernet driver 0.9.26 eth0: RealTek RTL8139 Fast Ethernet at 0xd8800000, 00:d0:70:01:0f:2f, IRQ 18 eth0: Identified 8139 chip type 'RTL-8139C' Unable to handle kernel paging request at virtual address e9272d39 c026c49f *pde = 00000000 Oops: 0000 CPU: 0 EIP: 0010:[] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00210246 eax: 00008000 ebx: e9272d0d ecx: d561e3a0 edx: 00000000 esi: cad75f6c edi: cc47bec0 ebp: cad75f6c esp: cad75ee4 ds: 0018 es: 0018 ss: 0018 Process ncftp (pid: 31976, stackpage=cad75000) Stack: d561e3a0 cad75f6c 00008000 00000000 00000000 cad75efc 00000000 00008000 cad75f20 c022c008 cc47bec0 cad75f6c 00008000 00000000 cad75f20 00000000 00000000 00000000 00000000 00000000 00000049 cad74000 00000064 d65bb520 Call Trace: [] [] [] [] Code: ff 53 2c 85 c0 89 c2 78 07 8b 44 24 18 89 46 04 8b 5c 24 1c >>EIP; c026c49f <===== Trace; c022c008 Trace; c022c147 Trace; c013ce97 Trace; c010773f Code; c026c49f 00000000 <_EIP>: Code; c026c49f <===== 0: ff 53 2c call *0x2c(%ebx) <===== Code; c026c4a2 3: 85 c0 test %eax,%eax Code; c026c4a4 5: 89 c2 mov %eax,%edx Code; c026c4a6 7: 78 07 js 10 <_EIP+0x10> Code; c026c4a8 9: 8b 44 24 18 mov 0x18(%esp,1),%eax Code; c026c4ac d: 89 46 04 mov %eax,0x4(%esi) Code; c026c4af 10: 8b 5c 24 1c mov 0x1c(%esp,1),%ebx Linux hostname 2.4.21-pre5 #2 SMP Fri Feb 28 00:03:43 CET 2003 i686 Celeron (Mendocino) GenuineIntel GNU/Linux Gnu C 3.2.2 Gnu make 3.80 util-linux 2.11z mount 2.11z modutils 2.4.23 e2fsprogs 1.32 reiserfsprogs 3.6.3 Linux C Library 2.3.2 Dynamic linker (ldd) 2.3.2 Procps 3.1.6 Net-tools 1.60 Kbd 1.06 Sh-utils 2.0.15 From ralph@istop.com Mon Mar 17 19:43:26 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 17 Mar 2003 19:43:32 -0800 (PST) Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2I3hOq9018274 for ; Mon, 17 Mar 2003 19:43:25 -0800 Received: from ns.istop.com (ns.istop.com 
[66.11.168.199]) by smtp.istop.com (Postfix) with ESMTP id 187DD36A1E for ; Mon, 17 Mar 2003 22:43:24 -0500 (EST) Date: Mon, 17 Mar 2003 22:43:38 -0500 (EST) From: Ralph Doncaster Reply-To: ralph+d@istop.com To: netdev@oss.sgi.com Subject: Re: Linux router performance (3c59x) (fwd) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1957 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralph@istop.com Precedence: bulk X-list: netdev Content-Length: 6555 Lines: 167 I haven't heard from Jamal or Dave, so perhaps someone from this list has some wisdom to impart. Currently the box in question is running a 67% system load with ~40kpps. Here's the switch port stats that the 2 3c905cx cards are plugged into: 5 minute input rate 36143000 bits/sec, 8914 packets/sec 5 minute output rate 54338000 bits/sec, 10722 packets/sec - 5 minute input rate 50585000 bits/sec, 12445 packets/sec 5 minute output rate 34326000 bits/sec, 9596 packets/sec Ralph Doncaster principal, IStop.com ---------- Forwarded message ---------- Date: Mon, 17 Mar 2003 11:18:25 -0500 (EST) From: Ralph Doncaster To: jamal Cc: "mcr@sandelman.ottawa.on.ca" , "vortex@scyld.com" , "davem@redhat.com" Subject: Re: Linux router performance (3c59x) Hi Jamal, I found a 3c59x NAPI patch (from orr.falooley.org/pub/linux/net/, which seems to be down right now), and applied that against the stock 2.4.20 kernel. Unfortunately I don't see a noticable improvement from 2.4.19 without NAPI. When I send a 10kpps flood of 64-byte frames through the router, the CPU flatlines (duron 750). 
The number of interrupts/sec doesn't go down and the context switching
is reduced, so NAPI is having some effect, but not the intended
reduction in CPU load (the 10kpps flood was done during the middle of
this vmstat log, where you see idle go to 0):

root@tor-router /usr/src# vmstat 2
   procs                      memory    swap          io      system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo    in    cs  us  sy  id
 0  0  1    932 623600  15636  27916   0   0     2    51  2762  1085  13  41  47
 0  0  1    932 623600  15636  27916   0   0     0     0 18603  1660   0  43  57
 0  0  1    932 623600  15636  27916   0   0     0     0 18495  1593   0  50  50
 0  0  1    932 623584  15652  27916   0   0     0    20 18949  1671   0  49  51
 0  0  1    932 623584  15652  27916   0   0     0     0 18768  1192   0  63  37
 0  0  1    932 623584  15652  27916   0   0     0     0 16084    62   0 100   0
 0  0  1    932 623584  15652  27916   0   0     0     0 16059   132   0 100   0
 0  0  1    932 623576  15660  27916   0   0     0     8 18043    24   0 100   0
 0  0  1    932 623576  15660  27916   0   0     0     0 17795    71   0 100   0
 0  0  1    932 623576  15660  27916   0   0     0     0 14181    70   0 100   0
 0  0  1    932 623576  15660  27916   0   0     0     0 16764   122   0 100   0
 0  0  1    932 623576  15660  27916   0   0     0     0 16802    63   0 100   0
 0  0  1    932 623568  15668  27916   0   0     0     8 17044    23   0 100   0
 0  0  1    932 623568  15668  27916   0   0     0     0 19198  1520   0  48  52
 0  0  1    932 623568  15668  27916   0   0     0     0 18684  1611   0  39  61
 0  0  1    932 623568  15668  27916   0   0     0     0 18256  1518   0  44  56

This is a box doing straight routing (no firewalling), with a full bgp4
routing table (>100k routes). The kernel advanced router config option
as well as fastroute were chosen.

Is the 3c59x NAPI patch just no good, or is there something else I
should be doing to get decent linux routing throughput with it?

Ralph Doncaster
principal, IStop.com

On Sat, 14 Dec 2002, jamal wrote:

>
>
> On Fri, 13 Dec 2002, Ralph Doncaster wrote:
>
> > Hi Jamal,
> >
> > I'm running 2.4.19 on a linux router in Toronto. It's got 2 3c905CX
> > cards, and I've disabled rx_copybreak in the driver. FASTROUTE is not
> > turned on. CPU is a Duron-750. At around 40kpps, the box hits 100% CPU.
> > > > And it should probably die if you start hitting around 60kpps i.e no > packets make it. > > > Based on your numbers for 2.2.14, it would seem FASTROUTE would make a big > > difference. > > http://robur.slu.se/Linux/net-development/jamal/FF-html/img7.htm > > > > It has its disadvantages: > It chews a lot of CPU and theres a lot of things you must bypass > by virtue of DMA-DMA connectivity. > > > Comparing the Usenix paper results for 2.4 seem to show that FASTROUTE > > doesn't make as much difference. Since your numbers show almost 100kpps > > I think if you only have a couple of interfaces on a P2 you should pretty > much be able to do about 100Kpps on each. > > > for regular 2.4 I'm guessing that means the irqmitigation of the stock > > 3c59x.c sucks, even though it looks like it will process multiple packets > > per interrupt under load (max_interrupt_work). > > irq mitigation is only done by a few NICs. NAPI does a better mitigation > in s/ware without requiring h/ware support. The mitigation is based on > feedback from the system; so if the system is slow (pentium vs P3) you > process less and NEVER die. I believe theres 3c59x.c NAPI driver. > > > DaveM was rather terse > > when I communicated with him recently, but what he clearly said was the > > e1000 is the best performer under linux due to software IRQ mitigation > > features in the driver (not the hardware RxIntDelay feature). > > > > He was more than likely refering to NAPI. e1000 is definetly the best; but > i dont own any; Robert Olson owns a few and he swears by them. I can email > him for details if you are interested. > > > Now that 2.4.20 includes the e1000 driver, it would seem the easiest way > > to get high-performance routing under Linux would be for me to upgrade > > from 2.4.19 to 2.4.20 with the FASTROUTE enabled, and swap my 3C905CX > > cards for a couple of e1000's. > > > > No. Forget FASTROUTE. 
I dont think anyone is looking at it at all or it > is ever being updated; we killed it with NAPI perfomance wise, no > difference and featurewise NAPI is superior. > Although recently i have been thinking of experimenting withe CISCO like > adjancecies/CEF (but that is a totaly different thing). > > > Looking at the README > > ftp://robur.slu.se/pub/Linux/net-development/NAPI/README > > It seems to indicate the 2.4.20 e1000 driver is NAPIfied, so I shouldn't > > need any NAPI patches. > > > > 2.4.20 already has NAPI built in. When you compile the kernel, you have > it. > > > Lastly, your comments to MCR about NAPI being better than FASTROUTE seem > > to imply that I don't need FASTROUTE. However I would expect FASTROUTE to > > provide additional performance when used with NAPI (since it avoids the > > codepath for firewalling & NAT). > > > > If you dont have any firewalling policies on theres no difference. > NAT is a different beast - that thing puts Linux to shame. > so, no you dont need FASTROUTE. 
> > cheers, > jamal > > From greearb@candelatech.com Mon Mar 17 20:48:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 17 Mar 2003 20:48:31 -0800 (PST) Received: from grok.yi.org (IDENT:i310CLwQNXg+lmUPA94EVJ/GaxeEpdtH@dhcp93-dsl-usw3.w-link.net [206.129.84.93]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2I4mRq9019113 for ; Mon, 17 Mar 2003 20:48:28 -0800 Received: from candelatech.com (IDENT:FBAqSCWmwi3t9499U+VZcluml7HMdYK9@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.6) with ESMTP id h2I4m8a26649; Mon, 17 Mar 2003 20:48:09 -0800 Message-ID: <3E76A508.30007@candelatech.com> Date: Mon, 17 Mar 2003 20:48:08 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030210 X-Accept-Language: en-us, en MIME-Version: 1.0 To: ralph+d@istop.com CC: netdev@oss.sgi.com Subject: Re: Linux router performance (3c59x) (fwd) References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1958 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 1074 Lines: 32 Ralph Doncaster wrote: > I haven't heard from Jamal or Dave, so perhaps someone from this list has > some wisdom to impart. > Currently the box in question is running a 67% system load with ~40kpps. > Here's the switch port stats that the 2 3c905cx cards are plugged into: > > 5 minute input rate 36143000 bits/sec, 8914 packets/sec > 5 minute output rate 54338000 bits/sec, 10722 packets/sec > - > 5 minute input rate 50585000 bits/sec, 12445 packets/sec > 5 minute output rate 34326000 bits/sec, 9596 packets/sec When using larger packets, NAPI doesn't have much effect. Have you tried routing with simple routing tables to see if that speeds anything up? 
Could also try an e100 or Tulip NIC. Those usually work pretty good... Or, could use an e1000 GigE NIC... It's also possible that you are just reaching the limit of your system. Ben -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From ralph@istop.com Mon Mar 17 21:10:10 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 17 Mar 2003 21:10:47 -0800 (PST) Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2I59Tq9030978 for ; Mon, 17 Mar 2003 21:10:10 -0800 Received: from ns.istop.com (ns.istop.com [66.11.168.199]) by smtp.istop.com (Postfix) with ESMTP id 81DF9369CD; Tue, 18 Mar 2003 00:09:28 -0500 (EST) Date: Tue, 18 Mar 2003 00:09:42 -0500 (EST) From: Ralph Doncaster Reply-To: ralph+d@istop.com To: Ben Greear Cc: "netdev@oss.sgi.com" Subject: Re: Linux router performance (3c59x) (fwd) In-Reply-To: <3E76A508.30007@candelatech.com> Message-ID: References: <3E76A508.30007@candelatech.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1959 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralph@istop.com Precedence: bulk X-list: netdev Content-Length: 1604 Lines: 42 On Mon, 17 Mar 2003, Ben Greear wrote: > Ralph Doncaster wrote: [...] > > Currently the box in question is running a 67% system load with ~40kpps. > > Here's the switch port stats that the 2 3c905cx cards are plugged into: > > > > 5 minute input rate 36143000 bits/sec, 8914 packets/sec > > 5 minute output rate 54338000 bits/sec, 10722 packets/sec > > - > > 5 minute input rate 50585000 bits/sec, 12445 packets/sec > > 5 minute output rate 34326000 bits/sec, 9596 packets/sec > > When using larger packets, NAPI doesn't have much effect. 
So I should just give up on Linux and go with FreeBSD?
http://info.iet.unipi.it/~luigi/polling/

> Have you tried routing with simple routing tables to see if that
> speeds anything up?

No, but I did read through a bunch of the route-cache code and even with
the dynamic hashtable size introduced in recent 2.4 revs, it looks very
inefficient for core routing. I'd expect a speedup with a small routing
table, but then it would be useless as a core router in my network.

> Could also try an e100 or Tulip NIC. Those usually work pretty
> good... Or, could use an e1000 GigE NIC...

If I can get confirmation that under similar conditions the e1000
performs significantly better, then I'll go that route.

> It's also possible that you are just reaching the limit of your
> system.

The NAPI docs imply 144kpps is easily attainable on lesser hardware than
mine. Also I can't see bandwidth being the issue as I'm moving
<25Mbytes/sec over the PCI bus. I should be able to do more than double
that before I have to worry about PCI saturation.
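
As a rough check of the bandwidth estimate above, the switch port
counters quoted earlier in the thread can simply be summed. This is a
minimal sketch, not a measurement: it assumes both NICs sit on one
32-bit/33 MHz PCI bus (about 133 Mbytes/sec theoretical peak; the thread
does not state the bus type) and that every frame the switch counts
crosses that bus exactly once. Descriptor traffic and bus protocol
overhead are ignored, and the names below are made up for illustration.

```python
# Switch port counters quoted earlier in this thread (5-minute averages).
# Each entry: (label, bits per second, packets per second).
PORT_RATES = [
    ("port 1 input", 36_143_000, 8_914),
    ("port 1 output", 54_338_000, 10_722),
    ("port 2 input", 50_585_000, 12_445),
    ("port 2 output", 34_326_000, 9_596),
]

# ASSUMPTION (not stated in the thread): plain 32-bit/33 MHz PCI,
# 4 bytes * 33.33 MHz, roughly 133 Mbytes/sec theoretical peak.
PCI_PEAK_MBYTES_PER_SEC = 133.0


def summarize(rates):
    """Return (Mbytes/sec, packets/sec, average bytes per packet)."""
    total_bits = sum(bits for _, bits, _ in rates)
    total_pps = sum(pps for _, _, pps in rates)
    total_bytes = total_bits / 8
    return total_bytes / 1e6, total_pps, total_bytes / total_pps


mbytes_per_sec, packets_per_sec, avg_packet_bytes = summarize(PORT_RATES)
bus_share = mbytes_per_sec / PCI_PEAK_MBYTES_PER_SEC

print(f"{mbytes_per_sec:.1f} Mbytes/sec, {packets_per_sec} packets/sec")
print(f"average packet: {avg_packet_bytes:.0f} bytes")
print(f"share of assumed PCI peak: {bus_share:.0%}")
```

Under these assumptions the totals come to about 21.9 Mbytes/sec and
41677 packets/sec, which agrees with the "<25Mbytes/sec" and "~40kpps"
figures, and the average packet works out to roughly 526 bytes.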
-Ralph From greearb@candelatech.com Mon Mar 17 22:30:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 17 Mar 2003 22:30:55 -0800 (PST) Received: from grok.yi.org (IDENT:k1HCooalvCfNsQtKVDEeZFYhe2Gu5Aay@dhcp93-dsl-usw3.w-link.net [206.129.84.93]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2I6Unq9004679 for ; Mon, 17 Mar 2003 22:30:50 -0800 Received: from candelatech.com (IDENT:8NC5ZwCIsZC9Zhav0mw0rrPZ1eJIXTC/@localhost.localdomain [127.0.0.1]) by grok.yi.org (8.11.6/8.11.6) with ESMTP id h2I6Ula07511; Mon, 17 Mar 2003 22:30:47 -0800 Message-ID: <3E76BD17.7060208@candelatech.com> Date: Mon, 17 Mar 2003 22:30:47 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030210 X-Accept-Language: en-us, en MIME-Version: 1.0 To: ralph+d@istop.com CC: "netdev@oss.sgi.com" Subject: Re: Linux router performance (3c59x) (fwd) References: <3E76A508.30007@candelatech.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1960 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Content-Length: 2899 Lines: 85 Ralph Doncaster wrote: > On Mon, 17 Mar 2003, Ben Greear wrote: > > >>Ralph Doncaster wrote: > > [...] > >>>Currently the box in question is running a 67% system load with ~40kpps. >>>Here's the switch port stats that the 2 3c905cx cards are plugged into: >>> >>> 5 minute input rate 36143000 bits/sec, 8914 packets/sec >>> 5 minute output rate 54338000 bits/sec, 10722 packets/sec >>>- >>> 5 minute input rate 50585000 bits/sec, 12445 packets/sec >>> 5 minute output rate 34326000 bits/sec, 9596 packets/sec >> >>When using larger packets, NAPI doesn't have much effect. > > > So I should just give up on Linux and go with FreeBSD? 
> http://info.iet.unipi.it/~luigi/polling/ It would be interesting to see a performance comparison. >>Have you tried routing with simple routing tables to see if that >>speeds anything up? > > No, but I did read through a bunch of the route-cache code and even with > the dynamic hashtable size introduced in recent 2.4 revs, it looks very > ineficient for core routing. I'd expect a speedup with a small routing > table, but then it would be useless as a core router in my network. So, if making the routing table smaller 'fixes' things, then NAPI and your NIC is not the problem. >>Could also try an e100 or Tulip NIC. Those usually work pretty >>good... Or, could use an e1000 GigE NIC... > > > If I can get confirmation that under similar conditions the e1000 performs > significantly better, then I'll go that route. In my testing, I could get about 140kpps (64-byte packets) tx or rx on a single port. Bi-directional I got about 90kpps. This was a 1.8Ghz AMD processor with a tulip driver. When using MTU sized packets, could fill 4 ports with tx+rx traffic at 90+Mbps. With e1000 on a 64/66 PCI bus, I could transmit around 860Mbps with 1500 byte packets (tx + rx on the same machine, but different ports of a dual-port NIC), and could generate maybe 400kpps with small packets (I don't remember the exact number here...) This was using a slightly modified (and slower) pktgen module, which is standard in the latest kernels. So, sending/receiving packets at extreme rates is possible. Routing with 100k entries may not work nearly so well. >>It's also possible that you are just reaching the limit of your >>system. > > > The NAPI docs imply 144kpps is easily attainable on lesser hardware than > mine. Also I can't see bandwidth being the issue as I'm moving > <25Mbytes/sec over the PCI bus. I should be able to do more than double > that before I have to worry about PCI saturation. So, test w/smaller routing tables so you can see if it's routing or the NIC that is slowing you down. 
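
For context on the single-port figure above (about 140kpps of 64-byte
packets), it may help to compare it with Ethernet line rate. A minimal
sketch, assuming the tulip ports are 100 Mbit/s Fast Ethernet (the
message does not say so explicitly) and that the 64 bytes include the
FCS, so each frame occupies 64 + 20 bytes on the wire (8 bytes
preamble/SFD plus 12 bytes inter-frame gap):

```python
# Per-frame overhead on the wire that is not part of the frame itself:
# 8 bytes preamble/SFD + 12 bytes minimum inter-frame gap.
PREAMBLE_AND_IFG_BYTES = 20


def line_rate_pps(link_bits_per_sec, frame_bytes):
    """Maximum frames per second a link can carry at a given frame size."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_IFG_BYTES) * 8
    return link_bits_per_sec / bits_on_wire


# ASSUMPTION: 100 Mbit/s link, minimum-size 64-byte frames.
max_pps = line_rate_pps(100e6, 64)

# The roughly 140kpps reported above, as a fraction of that ceiling.
reported_share = 140_000 / max_pps

print(f"line rate: {max_pps:.0f} pps, reported share: {reported_share:.1%}")
```

On those assumptions the ceiling is just under 149kpps per direction, so
140kpps on one port is already close to what the wire can carry.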
> > -Ralph > -- Ben Greear President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear From Robert.Olsson@data.slu.se Tue Mar 18 01:55:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 01:55:47 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2I9sxq9008613 for ; Tue, 18 Mar 2003 01:55:40 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id KAA12892; Tue, 18 Mar 2003 10:54:48 +0100 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15990.60648.230534.852040@robur.slu.se> Date: Tue, 18 Mar 2003 10:54:48 +0100 To: ralph+d@istop.com Cc: netdev@oss.sgi.com, Robert.Olsson@data.slu.se Subject: Re: Linux router performance (3c59x) (fwd) In-Reply-To: References: X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1961 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 2113 Lines: 52 Ralph Doncaster writes: > I haven't heard from Jamal or Dave, so perhaps someone from this list has > some wisdom to impart. > Currently the box in question is running a 67% system load with ~40kpps. > Here's the switch port stats that the 2 3c905cx cards are plugged into: Hello! First we do a lot of testing with routing path but have no experience with the hardware you have 3c59x or duron. In general it seems hard to extrapolate performance X1 % CPU at X2 pps. You don't see CPU used in IRQ context and not in some of softIRQ's. I think a better way for this tests is to input "overload" so your system gets saturated. You get the DoS test for free... After getting the throughput you have figure out what's your bottleneck CPU, PCI etc. 
> This is a box doing straight routing (no firewalling), with a full bgp4
> routing table (>100k routes). Kernel advanced router config option as
> well as fastroute was chosen.

The size of the routing table itself has no effect... The challenge
comes when there is a high number of new "flows" per second, so that
garbage collection gets active. This can be seen with the rtstat program
in the iproute2 package.

Currently there is no driver with FASTROUTE support in the kernel, so
this will not do you any good now. But Linux routing (and packet
overload) performance is still very good. You can see performance
numbers as well as profiles for different setups at

http://robur.slu.se/Linux/net-development/experiments/router-profile.html

As seen there, packet memory allocation is one of the CPU consumers. We
also see that slab is not fully per-CPU, so we are spinning in the SMP
case. And as seen, UP gives about 345 kpps; skb recycling bumps this up
to 507 kpps. The challenge for now is to get aggregated performance with
SMP.

Also remember that networking, and routing in particular, is very much
data transport: DMA transfers from and to memory, which have to interact
with the CPU/driver arbitrating for the bus to manage these DMAs.
Latencies and serializations are not obvious at this level.

Cheers.
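
To put the two uniprocessor figures above in perspective, they can be
turned into an average time budget per packet. This is only arithmetic
on the quoted 345 kpps and 507 kpps numbers, not a new measurement, and
the function and variable names are invented for this sketch:

```python
def per_packet_us(kpps):
    """Average time available per packet, in microseconds, at a given rate."""
    return 1e6 / (kpps * 1e3)


UP_KPPS = 345        # uniprocessor forwarding rate quoted above
RECYCLE_KPPS = 507   # same setup with skb recycling, as quoted above

baseline_us = per_packet_us(UP_KPPS)
recycled_us = per_packet_us(RECYCLE_KPPS)
saved_us = baseline_us - recycled_us
gain_pct = (RECYCLE_KPPS / UP_KPPS - 1) * 100

print(f"{baseline_us:.2f} us -> {recycled_us:.2f} us per packet")
print(f"saved {saved_us:.2f} us per packet, {gain_pct:.0f}% more packets/sec")
```

That is roughly 2.9 microseconds per packet falling to about 2.0, which
gives a feel for how much of the per-packet cost is packet memory
allocation in that setup.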
--ro From erik@hensema.net Tue Mar 18 05:46:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 05:46:18 -0800 (PST) Received: from dexter.hensema.net (cc78409-a.hnglo1.ov.home.nl [212.120.97.185]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IDkAq9023728 for ; Tue, 18 Mar 2003 05:46:12 -0800 Received: from bender.home.hensema.net (bender.ipv6.hensema.net [IPv6:2001:888:10a1:0:202:44ff:fe69:60f5]) by dexter.hensema.net (8.12.3/8.12.3) with ESMTP id h2IDk7PA023144 for ; Tue, 18 Mar 2003 14:46:07 +0100 Received: from bender.home.hensema.net ([127.0.0.1]) by bender.home.hensema.net (8.12.3/8.12.3) with ESMTP id h2IDB5nB016389 for ; Tue, 18 Mar 2003 14:11:05 +0100 Received: (from erik@localhost) by bender.home.hensema.net (8.12.3/8.12.3/Submit) id h2IDB4iP016388 for netdev@oss.sgi.com; Tue, 18 Mar 2003 14:11:04 +0100 Date: Tue, 18 Mar 2003 14:11:04 +0100 From: Erik Hensema To: netdev@oss.sgi.com Subject: TCP/IPv6 broken in Linux 2.5.64? Message-ID: <20030318131104.GA16367@hensema.net> Reply-To: erik@hensema.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.27i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1962 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: erik@hensema.net Precedence: bulk X-list: netdev Content-Length: 2916 Lines: 35 Hi, I'm trying to upgrade to Linux 2.5.x from 2.4.x. It seems to be working fine, except for IPv6: a TCP session can be established, but I can't send data. I can receive it though. 
This is a tcpdump session of me telnetting to the smtp port of my
server, which is on local ethernet; native ipv6:

14:01:24.169321 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: S 3441486378:3441486378(0) win 5760
14:01:24.169532 dexter.ipv6.hensema.net.smtp > bender.ipv6.hensema.net.32926: S 2141989668:2141989668(0) ack 3441486379 win 5712
14:01:24.170104 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: . ack 1 win 5760
14:01:24.187583 dexter.ipv6.hensema.net.47911 > bender.ipv6.hensema.net.ident: S 2130642408:2130642408(0) win 5760
14:01:24.187612 bender.ipv6.hensema.net.ident > dexter.ipv6.hensema.net.47911: R 0:0(0) ack 2130642409 win 0
14:01:24.198246 dexter.ipv6.hensema.net.smtp > bender.ipv6.hensema.net.32926: P 1:87(86) ack 1 win 5712
14:01:24.198285 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: . ack 87 win 5760
14:01:27.397607 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:27.598549 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:28.199460 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:29.223402 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:31.015042 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:34.342488 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:40.997436 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760
14:01:53.795498 bender.ipv6.hensema.net.32925 > dexter.ipv6.hensema.net.smtp: P 0:6(6) ack 1 win 5760
14:01:54.051386 bender.ipv6.hensema.net.32926 > dexter.ipv6.hensema.net.smtp: P 1:7(6) ack 87 win 5760

I do see the SMTP greeting. However, when I send a RSET, there's no
response from the server.

IPv4 is working fine. icmpv6/ipv6 and udp/ipv6 too.

bender is running linux 2.5.64.
dexter is running linux 2.4.18, mostly the SuSE 8.0 version (that is: quite heavily patched). -- Erik Hensema (erik@hensema.net) From solt@dns.toxicfilms.tv Tue Mar 18 05:54:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 05:55:01 -0800 (PST) Received: from dns.toxicfilms.tv (dns.toxicfilms.tv [150.254.37.24]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IDsHq9024095 for ; Tue, 18 Mar 2003 05:54:58 -0800 Received: by dns.toxicfilms.tv (Postfix, from userid 1000) id 8EE2E1CD30; Tue, 18 Mar 2003 14:54:15 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by dns.toxicfilms.tv (Postfix) with ESMTP id 600AE80016; Tue, 18 Mar 2003 14:54:15 +0100 (CET) Date: Tue, 18 Mar 2003 14:54:15 +0100 (CET) From: Maciej Soltysiak To: Erik Hensema Cc: netdev@oss.sgi.com Subject: Re: TCP/IPv6 broken in Linux 2.5.64? In-Reply-To: <20030318131104.GA16367@hensema.net> Message-ID: References: <20030318131104.GA16367@hensema.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1963 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: solt@dns.toxicfilms.tv Precedence: bulk X-list: netdev Content-Length: 195 Lines: 8 Hi, > bender is running linux 2.5.64. dexter is running linux 2.4.18, Are you a cartoons fan ? Dexter's laboratory, Futurama ? That's a cool way to name hosts. 
I am inspired :) Regards, Maciej From ahu@outpost.ds9a.nl Tue Mar 18 08:10:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 08:10:24 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IG9Xq9027286 for ; Tue, 18 Mar 2003 08:10:14 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id D7E894020; Tue, 18 Mar 2003 17:09:31 +0100 (CET) Date: Tue, 18 Mar 2003 17:09:31 +0100 From: bert hubert To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: interop success, ip6sec Linux 2.5.65 vs FreeBSD 4.7-STABLE Message-ID: <20030318160931.GA9529@outpost.ds9a.nl> Mail-Followup-To: bert hubert , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1964 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 698 Lines: 20 Thanks to Niels Bakker I can report that stock Linux 2.5.65 can talk IP6SEC with FreeBSD 4.7-STABLE (KAME 20010528/FreeBSD). It worked on the first try. This using ipsec-tools-0.2.2 and manual keying. Racoon is reported not to listen on IPv6 sockets yet, but we didn't try this. The configuration used is that described in http://lartc.org/howto/lartc.ipsec.html 'Intro with Manual Keying', with IPv4 addresses replaced by IPv6 addresses. Thanks for making this possible! 
Regards, Bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From ahu@outpost.ds9a.nl Tue Mar 18 08:26:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 08:26:18 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IGPXq9027668 for ; Tue, 18 Mar 2003 08:26:14 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id C8AE740F7; Tue, 18 Mar 2003 17:25:32 +0100 (CET) Date: Tue, 18 Mar 2003 17:25:32 +0100 From: bert hubert To: Erik Hensema Cc: netdev@oss.sgi.com Subject: Re: TCP/IPv6 broken in Linux 2.5.64? Message-ID: <20030318162532.GA9705@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Erik Hensema , netdev@oss.sgi.com References: <20030318131104.GA16367@hensema.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030318131104.GA16367@hensema.net> User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1965 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 1579 Lines: 51 On Tue, Mar 18, 2003 at 02:11:04PM +0100, Erik Hensema wrote: > Hi, > > I'm trying to upgrade to Linux 2.5.x from 2.4.x. It seems to be working > fine, except for IPv6: a TCP session can be established, but I can't send > data. I can receive it though. I can confirm this. 2.5.65 can connect to other hosts out there, like ipv6 irc servers, or an IPv6 zonetransfer. However, when I try to ssh from 2.5.65 to another 2.5.65, nothing happens. 2.4.18 can also ssh to 2.5.65. 
So we have:

        to:  2.4.18  2.5.65
from:
2.4.18       OK      OK
2.5.65       OK      ERROR

tcpdump from 2.5.65 to 2.5.65:

25.987220 hostA.33180 > hostB.22: S 2721590261:2721590261(0) win 5760
25.987490 hostB.22 > hostA.33180: S 2377513523:2377513523(0) ack 2721590262 win 5712
25.987622 hostA.33180 > hostB.22: . ack 1 win 5760
25.993273 hostB.22 > hostA.33180: P 1:41(40) ack 1 win 5712
26.193443 hostB.22 > hostA.33180: P 1:41(40) ack 1 win 5712
26.757236 hostB.22 > hostA.33180: P 1:41(40) ack 1 win 5712

The originating host does not ACK the received data, it appears. No
IPSEC is involved with this.

Regards,

bert

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://lartc.org             Linux Advanced Routing & Traffic Control HOWTO
http://netherlabs.nl         Consulting

From ahu@outpost.ds9a.nl Tue Mar 18 08:51:27 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 08:51:32 -0800 (PST)
Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IGpPq9028216 for ; Tue, 18 Mar 2003 08:51:26 -0800
Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 80DA044D6; Tue, 18 Mar 2003 17:51:24 +0100 (CET)
Date: Tue, 18 Mar 2003 17:51:24 +0100
From: bert hubert
To: Erik Hensema , netdev@oss.sgi.com
Subject: Re: TCP/IPv6 broken in Linux 2.5.64?
Message-ID: <20030318165124.GA10127@outpost.ds9a.nl>
Mail-Followup-To: bert hubert , Erik Hensema , netdev@oss.sgi.com
References: <20030318131104.GA16367@hensema.net> <20030318162532.GA9705@outpost.ds9a.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030318162532.GA9705@outpost.ds9a.nl>
User-Agent: Mutt/1.3.28i
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 1966
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ahu@ds9a.nl
Precedence: bulk
X-list: netdev
Content-Length: 2564
Lines: 65

On Tue, Mar 18, 2003 at 05:25:32PM +0100, bert hubert wrote:

> So we have:
>
>         to:  2.4.18  2.5.65
> from:
> 2.4.18       OK      OK
> 2.5.65       OK      ERROR

Ok, here is a counterexample of 2.5.65 happily talking to 2.5.65, so it
sometimes does work:

33.933107 hostC.33255 > hostB.22: SWE 3663286543:3663286543(0) win 5680
33.933203 hostB.22 > hostC.33255: S 3554639364:3554639364(0) ack 3663286544 win 5712
33.999407 hostC.33255 > hostB.22: . ack 1 win 5680
34.007239 hostB.22 > hostC.33255: P 1:41(40) ack 1 win 5712
34.072108 hostC.33255 > hostB.22: . ack 41 win 5680
34.091633 hostC.33255 > hostB.22: P 1:40(39) ack 41 win 5680
34.097058 hostB.22 > hostC.33255: . ack 40 win 5712

Here is a Mac OS X laptop, hostD, trying and failing to talk to hostB,
which runs 2.5.65:

16.829237 hostD.56023 > hostB.22: S 3755233012:3755233012(0) win 32768
16.829482 hostB.22 > hostD.56023: S 3776737964:3776737964(0) ack 3755233013 win 5712
17.296953 hostD.56023 > hostB.22: . ack 1 win 32844
17.301934 hostB.22 > hostD.56023: P 1:41(40) ack 1 win 5712
18.953105 hostB.22 > hostD.56023: P 1:41(40) ack 1 win 5712
21.768126 hostB.22 > hostD.56023: P 1:41(40) ack 1 win 5712
27.398163 hostB.22 > hostD.56023: P 1:41(40) ack 1 win 5712

Closer inspection shows:

17:47:39.206188 HostB.22 > HostD.56030: P [bad tcp cksum 407f!] 1:41(40) ack 1 win 5712 (len 72, hlim 64)

Note the bad checksum! It appears hostB is the culprit here, the one
constant factor in the entire story. HostB is a Pentium PRO, the other
machines aren't. Perhaps this might be it? This dump was run on hostB,
so no chance of bad media there.

Let me know if I can do more research - icmp6 works just fine.

Regards,

bert

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://lartc.org             Linux Advanced Routing & Traffic Control HOWTO
http://netherlabs.nl         Consulting

From larslan@merete.zapto.org Tue Mar 18 10:29:19 2003
Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 10:29:23 -0800 (PST)
Received: from merete.balder.no (197.80-202-160.nextgentel.com [80.202.160.197]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IITGq9002055 for ; Tue, 18 Mar 2003 10:29:18 -0800
Received: from localhost (larslan@localhost) by merete.balder.no (8.11.6/8.11.6) with ESMTP id h2IIM8w14281 for ; Tue, 18 Mar 2003 19:22:09 +0100
Date: Tue, 18 Mar 2003 19:22:08 +0100 (CET)
From: Lars Landmark
X-X-Sender: larslan@merete.balder.no
To: netdev@oss.sgi.com
Subject: class/qdisc question
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 1967
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: larslan@merete.zapto.org
Precedence: bulk
X-list: netdev
Content-Length: 1664
Lines: 56

Hi,

I am trying to write my own class-based queue, but as usual some
problems remain unresolved.

I have managed to send packets through my queue. This works as long as
I do not attach a class or filters. If I try to attach a class or
filter, my computer hangs. I cannot read any message or do anything; my
only choice is to press the power button to reboot.

My "queue" is compiled as a module, and when I run "insmod" it is loaded
into the kernel.
This operation does not report any error.

[root@lars larslan]# /sbin/insmod sch_kll
Using /lib/modules/2.4.20/kernel/net/sched/sch_kll.o

/sbin/lsmod reports:
Module                  Size  Used by    Not tainted
sch_htb                21088   1
sch_kll                 9608   0  (autoclean) (unused)
3c59x                  28520   2

When I now configure this module with my patched tc
************
[root@lars iproute2.lars]# ./tc/tc qdisc add dev eth0 root handle 1: kll default 5
************
the output from /sbin/lsmod does not change. It still says used by 0 and (unused).

I have added some output to every procedure, and dmesg reports that these procedures are called:
*************
[root@lars iproute2.lars]# dmesg
KLL: inne i classify?
KLL: inne i dequeue?
KLL: inne i dequeue?
....
*************
So my question is: how can this happen? I thought that as soon as I configure the module, its usage count would be incremented. Is it possible that my computer crashes when I attach a filter because the kernel does not know that the kll module is in use?

I would be very happy if someone could tell me what I have missed.
Any suggestion is appreciated, Thanks in advance Lars Student From mk@karaba.org Tue Mar 18 10:32:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 10:32:34 -0800 (PST) Received: from zanzibar.karaba.org (karaba.org [218.219.152.88] (may be forged)) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IIWSq9002432 for ; Tue, 18 Mar 2003 10:32:29 -0800 Received: from [3ffe:501:1057:710::53] (helo=hyakusiki.karaba.org) by zanzibar.karaba.org with esmtp (Exim 3.35 #1 (Debian)) id 18vLsb-0005QQ-00; Wed, 19 Mar 2003 03:32:05 +0900 Date: Tue, 18 Mar 2003 10:32:27 -0800 Message-ID: <87of48h6f8.wl@karaba.org> From: Mitsuru KANDA / =?ISO-2022-JP?B?GyRCP0BFRBsoQiAbJEI9PBsoQg==?= To: davem@redhat.com, kuznet@ms2.inr.ac.ru Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: [PATCH] IPv6 Extension headers (Re: [PATCH] IPv6 IPsec support) In-Reply-To: <20030305.204348.130225511.davem@redhat.com> References: <20030305233025.784feb00.kazunori@miyazawa.org> <20030305.152530.70806720.davem@redhat.com> <20030306093219.1a702868.kazunori@miyazawa.org> <20030305.204348.130225511.davem@redhat.com> MIME-Version: 1.0 (generated by SEMI 1.14.4 - "Hosorogi") Content-Type: text/plain; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1968 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mk@karaba.org Precedence: bulk X-list: netdev Content-Length: 21766 Lines: 789 Hello, At Wed, 05 Mar 2003 20:43:48 -0800 (PST), "David S. Miller" wrote: > > From: Kazunori Miyazawa > Date: Thu, 6 Mar 2003 09:32:19 +0900 > > - Extension Header Processing on inbound: > As a result of IPv6 IPsec support, Extension Header processing is devided > into ipv6_parse_exthdrs and ipproto->handler. I think it is better to merge > other Extension Header handling into ipproto->handler. > > Ok. 
This patch merges inbound IPv6 extension header processing parts into inet6_protocols{} like a IPv6 AH/ESP headers. As a result of this patch, I removed destopt parsing part in xfrm6_rcv() and removed ipv6_parse_exthdrs(). Could you check this patch? (This patch is against 2.5.65.) Best Regards, -mk Index: include/net/ipv6.h =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/ipv6.h,v retrieving revision 1.1.1.4 diff -u -r1.1.1.4 ipv6.h --- include/net/ipv6.h 9 Jan 2003 11:14:19 -0000 1.1.1.4 +++ include/net/ipv6.h 18 Mar 2003 05:11:39 -0000 @@ -203,11 +203,7 @@ extern int ip6_call_ra_chain(struct sk_buff *skb, int sel); -extern int ipv6_reassembly(struct sk_buff **skb, int); - extern int ipv6_parse_hopopts(struct sk_buff *skb, int); - -extern int ipv6_parse_exthdrs(struct sk_buff **skb, int); extern struct ipv6_txoptions * ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt); Index: include/net/protocol.h =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/protocol.h,v retrieving revision 1.1.1.3 diff -u -r1.1.1.3 protocol.h --- include/net/protocol.h 11 Nov 2002 04:08:20 -0000 1.1.1.3 +++ include/net/protocol.h 18 Mar 2003 05:11:39 -0000 @@ -44,7 +44,7 @@ #if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) struct inet6_protocol { - int (*handler)(struct sk_buff *skb); + int (*handler)(struct sk_buff **skbp); void (*err_handler)(struct sk_buff *skb, struct inet6_skb_parm *opt, Index: include/net/transp_v6.h =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/transp_v6.h,v retrieving revision 1.1.1.1 diff -u -r1.1.1.1 transp_v6.h --- include/net/transp_v6.h 7 Oct 2002 10:22:46 -0000 1.1.1.1 +++ include/net/transp_v6.h 18 Mar 2003 05:11:39 -0000 @@ -15,6 +15,14 @@ struct flowi; +/* extention headers */ +extern void 
ipv6_hopopts_init(void); +extern void ipv6_rthdr_init(void); +extern void ipv6_frag_init(void); +extern void ipv6_nodata_init(void); +extern void ipv6_destopt_init(void); + +/* transport protocols */ extern void rawv6_init(void); extern void udpv6_init(void); extern void tcpv6_init(void); Index: include/net/xfrm.h =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/xfrm.h,v retrieving revision 1.1.1.8 diff -u -r1.1.1.8 xfrm.h --- include/net/xfrm.h 13 Mar 2003 17:29:53 -0000 1.1.1.8 +++ include/net/xfrm.h 18 Mar 2003 05:11:39 -0000 @@ -415,7 +415,7 @@ extern void xfrm_replay_advance(struct xfrm_state *x, u32 seq); extern int xfrm_check_selectors(struct xfrm_state **x, int n, struct flowi *fl); extern int xfrm4_rcv(struct sk_buff *skb); -extern int xfrm6_rcv(struct sk_buff *skb); +extern int xfrm6_rcv(struct sk_buff **pskb); extern int xfrm6_clear_mutable_options(struct sk_buff *skb, u16 *nh_offset, int dir); extern int xfrm_user_policy(struct sock *sk, int optname, u8 *optval, int optlen); Index: net/ipv4/xfrm_input.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv4/xfrm_input.c,v retrieving revision 1.1.1.4 diff -u -r1.1.1.4 xfrm_input.c --- net/ipv4/xfrm_input.c 13 Mar 2003 17:29:03 -0000 1.1.1.4 +++ net/ipv4/xfrm_input.c 18 Mar 2003 05:11:39 -0000 @@ -311,8 +311,9 @@ return nexthdr; } -int xfrm6_rcv(struct sk_buff *skb) +int xfrm6_rcv(struct sk_buff **pskb) { + struct sk_buff *skb = *pskb; int err; u32 spi, seq; struct xfrm_state *xfrm_vec[XFRM_MAX_DEPTH]; @@ -325,12 +326,8 @@ u16 nh_offset = 0; u8 nexthdr = 0; - if (hdr->nexthdr == IPPROTO_AH || hdr->nexthdr == IPPROTO_ESP) { - nh_offset = ((unsigned char*)&skb->nh.ipv6h->nexthdr) - skb->nh.raw; - hdr_len = sizeof(struct ipv6hdr); - } else { - hdr_len = skb->h.raw - skb->nh.raw; - } + nh_offset = ((unsigned char*)&skb->nh.ipv6h->nexthdr) - skb->nh.raw; + hdr_len 
= sizeof(struct ipv6hdr); tmp_hdr = kmalloc(hdr_len, GFP_ATOMIC); if (!tmp_hdr) @@ -378,18 +375,6 @@ xfrm_vec[xfrm_nr++] = x; iph = skb->nh.ipv6h; /* ??? */ - - if (nexthdr == NEXTHDR_DEST) { - if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8) || - !pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) { - err = -EINVAL; - goto drop; - } - nexthdr = skb->h.raw[0]; - nh_offset = skb->h.raw - skb->nh.raw; - skb_pull(skb, (skb->h.raw[1]+1)<<3); - skb->h.raw = skb->data; - } if (x->props.mode) { /* XXX */ if (iph->nexthdr != IPPROTO_IPV6) Index: net/ipv6/af_inet6.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/af_inet6.c,v retrieving revision 1.1.1.7 diff -u -r1.1.1.7 af_inet6.c --- net/ipv6/af_inet6.c 25 Feb 2003 05:33:26 -0000 1.1.1.7 +++ net/ipv6/af_inet6.c 18 Mar 2003 05:11:40 -0000 @@ -793,6 +793,13 @@ addrconf_init(); sit_init(); + /* Init v6 extention headers. */ + ipv6_hopopts_init(); + ipv6_rthdr_init(); + ipv6_frag_init(); + ipv6_nodata_init(); + ipv6_destopt_init(); + /* Init v6 transport protocols. */ udpv6_init(); tcpv6_init(); Index: net/ipv6/exthdrs.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/exthdrs.c,v retrieving revision 1.1.1.3 diff -u -r1.1.1.3 exthdrs.c --- net/ipv6/exthdrs.c 20 Feb 2003 08:34:32 -0000 1.1.1.3 +++ net/ipv6/exthdrs.c 18 Mar 2003 05:11:40 -0000 @@ -18,6 +18,9 @@ /* Changes: * yoshfuji : ensure not to overrun while parsing * tlv options. + * Mitsuru KANDA @USAGI : Remove ipv6_parse_exthdrs(). + * : Register inbound extention header + * : handlers as inet6_protocol{}. */ #include @@ -44,20 +47,6 @@ #include /* - * Parsing inbound headers. - * - * Parsing function "func" returns offset wrt skb->nh of the place, - * where next nexthdr value is stored or NULL, if parsing - * failed. It should also update skb->h tp point at the next header. 
- */ - -struct hdrtype_proc -{ - int type; - int (*func) (struct sk_buff **, int offset); -}; - -/* * Parsing tlv encoded headers. * * Parsing function "func" returns 1, if parsing succeed @@ -164,49 +153,77 @@ {-1, NULL} }; -static int ipv6_dest_opt(struct sk_buff **skb_ptr, int nhoff) +int ipv6_destopt_rcv(struct sk_buff **skbp) { - struct sk_buff *skb=*skb_ptr; + struct sk_buff *skb = *skbp; struct inet6_skb_parm *opt = (struct inet6_skb_parm *)skb->cb; + u8 nexthdr = 0; if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8) || !pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) { kfree_skb(skb); - return -1; + return 0; } + nexthdr = ((struct ipv6_destopt_hdr *)skb->h.raw)->nexthdr; + opt->dst1 = skb->h.raw - skb->nh.raw; if (ip6_parse_tlv(tlvprocdestopt_lst, skb)) { skb->h.raw += ((skb->h.raw[1]+1)<<3); - return opt->dst1; + return -nexthdr; } + + return 0; +} - return -1; +static struct inet6_protocol destopt_protocol = +{ + .handler = ipv6_destopt_rcv, +}; + +void __init ipv6_destopt_init(void) +{ + if (inet6_add_protocol(&destopt_protocol, IPPROTO_DSTOPTS) < 0) + printk(KERN_ERR "ipv6_destopt_init: Could not register protocol\n"); } /******************************** NONE header. No data in packet. ********************************/ -static int ipv6_nodata(struct sk_buff **skb_ptr, int nhoff) +int ipv6_nodata_rcv(struct sk_buff **skbp) { - kfree_skb(*skb_ptr); - return -1; + struct sk_buff *skb = *skbp; + + kfree_skb(skb); + return 0; +} + +static struct inet6_protocol nodata_protocol = +{ + .handler = ipv6_nodata_rcv, +}; + +void __init ipv6_nodata_init(void) +{ + if (inet6_add_protocol(&nodata_protocol, IPPROTO_NONE) < 0) + printk(KERN_ERR "ipv6_nodata_init: Could not register protocol\n"); } /******************************** Routing header. 
********************************/ -static int ipv6_routing_header(struct sk_buff **skb_ptr, int nhoff) +int ipv6_rthdr_rcv(struct sk_buff **skbp) { - struct sk_buff *skb = *skb_ptr; + struct sk_buff *skb = *skbp; struct inet6_skb_parm *opt = (struct inet6_skb_parm *)skb->cb; struct in6_addr *addr; struct in6_addr daddr; int addr_type; int n, i; + u8 nexthdr = 0; struct ipv6_rt_hdr *hdr; struct rt0_hdr *rthdr; @@ -215,15 +232,16 @@ !pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) { IP6_INC_STATS_BH(Ip6InHdrErrors); kfree_skb(skb); - return -1; + return 0; } hdr = (struct ipv6_rt_hdr *) skb->h.raw; + nexthdr = hdr->nexthdr; if ((ipv6_addr_type(&skb->nh.ipv6h->daddr)&IPV6_ADDR_MULTICAST) || skb->pkt_type != PACKET_HOST) { kfree_skb(skb); - return -1; + return 0; } looped_back: @@ -232,24 +250,24 @@ skb->h.raw += (hdr->hdrlen + 1) << 3; opt->dst0 = opt->dst1; opt->dst1 = 0; - return (&hdr->nexthdr) - skb->nh.raw; + return -nexthdr; } if (hdr->type != IPV6_SRCRT_TYPE_0 || (hdr->hdrlen & 0x01)) { icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, hdr->type != IPV6_SRCRT_TYPE_0 ? 2 : 1); - return -1; + return 0; } /* * This is the routing header forwarding algorithm from - * RFC 1883, page 17. + * RFC 2460, page 16. */ n = hdr->hdrlen >> 1; if (hdr->segments_left > n) { icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->segments_left) - skb->nh.raw); - return -1; + return 0; } /* We are about to mangle packet header. Be careful! 
@@ -259,8 +277,8 @@ struct sk_buff *skb2 = skb_copy(skb, GFP_ATOMIC); kfree_skb(skb); if (skb2 == NULL) - return -1; - *skb_ptr = skb = skb2; + return 0; + *skbp = skb = skb2; opt = (struct inet6_skb_parm *)skb2->cb; hdr = (struct ipv6_rt_hdr *) skb2->h.raw; } @@ -278,7 +296,7 @@ if (addr_type&IPV6_ADDR_MULTICAST) { kfree_skb(skb); - return -1; + return 0; } ipv6_addr_copy(&daddr, addr); @@ -289,23 +307,34 @@ ip6_route_input(skb); if (skb->dst->error) { dst_input(skb); - return -1; + return 0; } if (skb->dst->dev->flags&IFF_LOOPBACK) { if (skb->nh.ipv6h->hop_limit <= 1) { icmpv6_send(skb, ICMPV6_TIME_EXCEED, ICMPV6_EXC_HOPLIMIT, 0, skb->dev); kfree_skb(skb); - return -1; + return 0; } skb->nh.ipv6h->hop_limit--; goto looped_back; } dst_input(skb); - return -1; + return 0; } +static struct inet6_protocol rthdr_protocol = +{ + .handler = ipv6_rthdr_rcv, +}; + +void __init ipv6_rthdr_init(void) +{ + if (inet6_add_protocol(&rthdr_protocol, IPPROTO_ROUTING) < 0) + printk(KERN_ERR "ipv6_rthdr_init: Could not register protocol\n"); +}; + /* This function inverts received rthdr. NOTE: specs allow to make it automatically only if @@ -371,97 +400,6 @@ return opt; } -/******************************** - AUTH header. - ********************************/ - -/* - rfc1826 said, that if a host does not implement AUTH header - it MAY ignore it. We use this hole 8) - - Actually, now we can implement OSPFv6 without kernel IPsec. - Authentication for poors may be done in user space with the same success. - - Yes, it means, that we allow application to send/receive - raw authentication header. Apparently, we suppose, that it knows - what it does and calculates authentication data correctly. - Certainly, it is possible only for udp and raw sockets, but not for tcp. - - AUTH header has 4byte granular length, which kills all the idea - behind AUTOMATIC 64bit alignment of IPv6. Now we will lose - cpu ticks, checking that sender did not something stupid - and opt->hdrlen is even. Shit! 
--ANK (980730) - */ - -static int ipv6_auth_hdr(struct sk_buff **skb_ptr, int nhoff) -{ - struct sk_buff *skb=*skb_ptr; - struct inet6_skb_parm *opt = (struct inet6_skb_parm *)skb->cb; - int len; - - if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8)) - goto fail; - - /* - * RFC2402 2.2 Payload Length - * The 8-bit field specifies the length of AH in 32-bit words - * (4-byte units), minus "2". - * -- Noriaki Takamiya @USAGI Project - */ - len = (skb->h.raw[1]+2)<<2; - - if (len&7) - goto fail; - - if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+len)) - goto fail; - - opt->auth = skb->h.raw - skb->nh.raw; - skb->h.raw += len; - return opt->auth; - -fail: - kfree_skb(skb); - return -1; -} - -/* This list MUST NOT contain entry for NEXTHDR_HOP. - It is parsed immediately after packet received - and if it occurs somewhere in another place we must - generate error. - */ - -static struct hdrtype_proc hdrproc_lst[] = { - {NEXTHDR_FRAGMENT, ipv6_reassembly}, - {NEXTHDR_ROUTING, ipv6_routing_header}, - {NEXTHDR_DEST, ipv6_dest_opt}, - {NEXTHDR_NONE, ipv6_nodata}, - {NEXTHDR_AUTH, ipv6_auth_hdr}, - /* - {NEXTHDR_ESP, ipv6_esp_hdr}, - */ - {-1, NULL} -}; - -int ipv6_parse_exthdrs(struct sk_buff **skb_in, int nhoff) -{ - struct hdrtype_proc *hdrt; - u8 nexthdr = (*skb_in)->nh.raw[nhoff]; - -restart: - for (hdrt=hdrproc_lst; hdrt->type >= 0; hdrt++) { - if (hdrt->type == nexthdr) { - if ((nhoff = hdrt->func(skb_in, nhoff)) >= 0) { - nexthdr = (*skb_in)->nh.raw[nhoff]; - goto restart; - } - return -1; - } - } - return nhoff; -} - - /********************************** Hop-by-hop options. **********************************/ @@ -530,6 +468,34 @@ if (ip6_parse_tlv(tlvprochopopt_lst, skb)) return sizeof(struct ipv6hdr); return -1; +} + +/* This is fake. We have already parsed hopopts in ipv6_rcv(). 
-mk */ +int ipv6_hopopts_rcv(struct sk_buff **skbp) +{ + struct sk_buff *skb = *skbp; + u8 nexthdr = 0; + + if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8) || + !pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) { + kfree_skb(skb); + return 0; + } + nexthdr = ((struct ipv6_hopopt_hdr *)skb->h.raw)->nexthdr; + skb->h.raw += (skb->h.raw[1]+1)<<3; + + return -nexthdr; +} + +static struct inet6_protocol hopopts_protocol = +{ + .handler = ipv6_hopopts_rcv, +}; + +void __init ipv6_hopopts_init(void) +{ + if (inet6_add_protocol(&hopopts_protocol, IPPROTO_HOPOPTS) < 0) + printk(KERN_ERR "ipv6_hopopts_init: Could not register protocol\n"); } /* Index: net/ipv6/icmp.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/icmp.c,v retrieving revision 1.1.1.7 diff -u -r1.1.1.7 icmp.c --- net/ipv6/icmp.c 13 Mar 2003 17:29:06 -0000 1.1.1.7 +++ net/ipv6/icmp.c 18 Mar 2003 05:11:40 -0000 @@ -74,7 +74,7 @@ static struct socket *__icmpv6_socket[NR_CPUS]; #define icmpv6_socket __icmpv6_socket[smp_processor_id()] -static int icmpv6_rcv(struct sk_buff *skb); +static int icmpv6_rcv(struct sk_buff **pskb); static struct inet6_protocol icmpv6_protocol = { .handler = icmpv6_rcv, @@ -458,8 +458,9 @@ * Handle icmp messages */ -static int icmpv6_rcv(struct sk_buff *skb) +static int icmpv6_rcv(struct sk_buff **pskb) { + struct sk_buff *skb = *pskb; struct net_device *dev = skb->dev; struct in6_addr *saddr, *daddr; struct ipv6hdr *orig_hdr; Index: net/ipv6/ip6_input.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/ip6_input.c,v retrieving revision 1.1.1.6 diff -u -r1.1.1.6 ip6_input.c --- net/ipv6/ip6_input.c 13 Mar 2003 17:29:06 -0000 1.1.1.6 +++ net/ipv6/ip6_input.c 18 Mar 2003 05:11:40 -0000 @@ -15,6 +15,10 @@ * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later 
version. */ +/* Changes + * + * Mitsuru KANDA @USAGI : Remove ipv6_parse_exthdrs(). + */ #include #include @@ -127,38 +131,11 @@ struct inet6_protocol *ipprot; struct sock *raw_sk; int nhoff; - int nexthdr; + int nexthdr = hdr->nexthdr; u8 hash; skb->h.raw = skb->nh.raw + sizeof(struct ipv6hdr); - /* - * Parse extension headers - */ - - nexthdr = hdr->nexthdr; - nhoff = offsetof(struct ipv6hdr, nexthdr); - - /* Skip hop-by-hop options, they are already parsed. */ - if (nexthdr == NEXTHDR_HOP) { - nhoff = sizeof(struct ipv6hdr); - nexthdr = skb->h.raw[0]; - skb->h.raw += (skb->h.raw[1]+1)<<3; - } - - /* This check is sort of optimization. - It would be stupid to detect for optional headers, - which are missing with probability of 200% - */ - if (nexthdr != IPPROTO_TCP && nexthdr != IPPROTO_UDP && - nexthdr != NEXTHDR_AUTH && nexthdr != NEXTHDR_ESP) { - nhoff = ipv6_parse_exthdrs(&skb, nhoff); - if (nhoff < 0) - return 0; - nexthdr = skb->nh.raw[nhoff]; - hdr = skb->nh.ipv6h; - } - if (!pskb_pull(skb, skb->h.raw - skb->data)) goto discard; @@ -173,7 +150,7 @@ hash = nexthdr & (MAX_INET_PROTOS - 1); if ((ipprot = inet6_protos[hash]) != NULL) { - int ret = ipprot->handler(skb); + int ret = ipprot->handler(&skb); if (ret < 0) { nexthdr = -ret; goto resubmit; @@ -182,6 +159,7 @@ } else { if (!raw_sk) { IP6_INC_STATS_BH(Ip6InUnknownProtos); + nhoff = offsetof(struct ipv6hdr, nexthdr); icmpv6_param_prob(skb, ICMPV6_UNK_NEXTHDR, nhoff); } else { IP6_INC_STATS_BH(Ip6InDelivers); Index: net/ipv6/reassembly.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/reassembly.c,v retrieving revision 1.1.1.4 diff -u -r1.1.1.4 reassembly.c --- net/ipv6/reassembly.c 20 Feb 2003 08:34:32 -0000 1.1.1.4 +++ net/ipv6/reassembly.c 18 Mar 2003 05:11:40 -0000 @@ -23,6 +23,7 @@ * Horst von Brand Add missing #include * Alexey Kuznetsov SMP races, threading, cleanup. * Patrick McHardy LRU queue of frag heads for evictor. 
+ * Mitsuru KANDA @USAGI Register inet6_protocol{}. */ #include #include @@ -525,6 +526,7 @@ int remove_fraghdr = 0; int payload_len; int nhoff; + u8 nexthdr = 0; fq_kill(fq); @@ -535,6 +537,8 @@ payload_len = (head->data - head->nh.raw) - sizeof(struct ipv6hdr) + fq->len; nhoff = head->h.raw - head->nh.raw; + nexthdr = ((struct frag_hdr*)head->h.raw)->nexthdr; + if (payload_len > 65535) { payload_len -= 8; if (payload_len > 65535) @@ -609,9 +613,13 @@ if (head->ip_summed == CHECKSUM_HW) head->csum = csum_partial(head->nh.raw, head->h.raw-head->nh.raw, head->csum); + if (!pskb_pull(head, head->h.raw - head->data)) { + goto out_fail; + } + IP6_INC_STATS_BH(Ip6ReasmOKs); fq->fragments = NULL; - return nhoff; + return nexthdr; out_oversize: if (net_ratelimit()) @@ -622,16 +630,18 @@ printk(KERN_DEBUG "ip6_frag_reasm: no memory for reassembly\n"); out_fail: IP6_INC_STATS_BH(Ip6ReasmFails); - return -1; + return 0; } -int ipv6_reassembly(struct sk_buff **skbp, int nhoff) +int ipv6_frag_rcv(struct sk_buff **skbp) { struct sk_buff *skb = *skbp; struct net_device *dev = skb->dev; struct frag_hdr *fhdr; struct frag_queue *fq; struct ipv6hdr *hdr; + int nhoff = skb->h.raw - skb->nh.raw; + u8 nexthdr = 0; hdr = skb->nh.ipv6h; @@ -640,15 +650,16 @@ /* Jumbo payload inhibits frag. 
header */ if (hdr->payload_len==0) { icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, skb->h.raw-skb->nh.raw); - return -1; + goto discard; } if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+sizeof(struct frag_hdr))) { icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, skb->h.raw-skb->nh.raw); - return -1; + goto discard; } hdr = skb->nh.ipv6h; fhdr = (struct frag_hdr *)skb->h.raw; + nexthdr = fhdr->nexthdr; if (!(fhdr->frag_off & htons(0xFFF9))) { /* It is not a fragmented frame */ @@ -674,10 +685,22 @@ spin_unlock(&fq->lock); fq_put(fq); - return ret; + return -ret; } +discard: IP6_INC_STATS_BH(Ip6ReasmFails); kfree_skb(skb); - return -1; + return 0; +} + +static struct inet6_protocol frag_protocol = +{ + .handler = ipv6_frag_rcv, +}; + +void __init ipv6_frag_init(void) +{ + if (inet6_add_protocol(&frag_protocol, IPPROTO_FRAGMENT) < 0) + printk(KERN_ERR "ipv6_frag_init: Could not register protocol\n"); } Index: net/ipv6/tcp_ipv6.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/tcp_ipv6.c,v retrieving revision 1.1.1.8 diff -u -r1.1.1.8 tcp_ipv6.c --- net/ipv6/tcp_ipv6.c 13 Mar 2003 17:29:06 -0000 1.1.1.8 +++ net/ipv6/tcp_ipv6.c 18 Mar 2003 05:11:40 -0000 @@ -1591,8 +1591,9 @@ return 0; } -static int tcp_v6_rcv(struct sk_buff *skb) +static int tcp_v6_rcv(struct sk_buff **pskb) { + struct sk_buff *skb = *pskb; struct tcphdr *th; struct sock *sk; int ret; Index: net/ipv6/udp.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/udp.c,v retrieving revision 1.1.1.7 diff -u -r1.1.1.7 udp.c --- net/ipv6/udp.c 13 Mar 2003 17:29:06 -0000 1.1.1.7 +++ net/ipv6/udp.c 18 Mar 2003 05:11:40 -0000 @@ -641,8 +641,9 @@ read_unlock(&udp_hash_lock); } -static int udpv6_rcv(struct sk_buff *skb) +static int udpv6_rcv(struct sk_buff **pskb) { + struct sk_buff *skb = *pskb; struct sock *sk; struct udphdr *uh; struct net_device *dev = skb->dev; From 
jleu@nero.doit.wisc.edu Tue Mar 18 12:52:43 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 18 Mar 2003 12:52:51 -0800 (PST) Received: from nero.doit.wisc.edu (nero.doit.wisc.edu [128.104.17.130]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2IKq2q9012457 for ; Tue, 18 Mar 2003 12:52:43 -0800 Received: (from jleu@localhost) by nero.doit.wisc.edu (8.11.6/8.11.6) id h2IMk0b26157; Tue, 18 Mar 2003 16:46:00 -0600 Date: Tue, 18 Mar 2003 16:45:59 -0600 From: "James R. Leu" To: Lars Landmark Cc: netdev@oss.sgi.com Subject: Re: class/qdisc question Message-ID: <20030318164559.A26154@mindspring.com> Reply-To: jleu@mindspring.com References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0.1i In-Reply-To: ; from larslan@merete.zapto.org on Tue, Mar 18, 2003 at 07:22:08PM +0100 Organization: none X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1969 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jleu@mindspring.com Precedence: bulk X-list: netdev Content-Length: 2027 Lines: 65 I suggest moving your development to user-mode-linux, then you can attach a debugger and track down the location of the lockup. http://sf.net/projects/user-mode-linux/ On Tue, Mar 18, 2003 at 07:22:08PM +0100, Lars Landmark wrote: > HI; > > I am trying to write my own class based queue. But as usual some problems > seems not to be resolved. > > I have achieved to send package through my queue. This can be done if I > not attach class or filters. If I do try to attach class or filter, my > computer stops. I can not read any message, nor do anything. My only > choice is to push power button in order to reboot. > > My "queue" is compiled as module and if I do "insmod", > it is loaded in to kernel. This operation do not report any error. 
> > [root@lars larslan]# /sbin/insmod sch_kll > Using /lib/modules/2.4.20/kernel/net/sched/sch_kll.o > > > /sbin/lsmod report > Module Size Used by Not tainted > sch_htb 21088 1 > sch_kll 9608 0 (autoclean) (unused) > 3c59x 28520 2 > > > When I now configure this module width my patched tc file > ************ > root@lars iproute2.lars]# ./tc/tc qdisc add dev eth0 root handle 1: kll > default 5 > ************ > output from /sbin/lsmod do not change. It still says used by 0 and > (unused). > > I have written som output in every procedyre, and dmesg report that this > procedures are called > ************* > [root@lars iproute2.lars]# dmesg > KLL: inne i classify? > KLL: inne i dequeue? > KLL: inne i dequeue? > .... > ************* > So my question is, how can this happen? > I thought that at once I configure module, > modules usage count would be incremented??? > Is there any possibility, that when I attach a filter my computer crash > because kernel do not know kll-module is in use??? > > I would be very happy if some could tell me what I have been missed... > > Any suggestion is appreciated, > Thanks in advance > > Lars > Student > > -- James R. Leu From MAILER-DAEMON@oss.sgi.com Wed Mar 19 04:48:13 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 04:48:18 -0800 (PST) Received: from kastor.ds.pg.gda.pl (postfix@kastor.ds.pg.gda.pl [213.192.72.3]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2JClUq9032595 for ; Wed, 19 Mar 2003 04:48:12 -0800 Received: by kastor.ds.pg.gda.pl (Postfix, from userid 8) id D4A6B2EEB1; Wed, 19 Mar 2003 13:47:26 +0100 (CET) X-Scanned-By: Bylem tu. 
Amavis :) Received: from vger.kernel.org (vger.kernel.org [209.116.70.75]) by kastor.ds.pg.gda.pl (Postfix) with ESMTP id 469BE2EEAC for ; Wed, 19 Mar 2003 13:47:24 +0100 (CET) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Wed, 19 Mar 2003 07:34:42 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Wed, 19 Mar 2003 07:34:42 -0500 Received: from outpost.ds9a.nl ([213.244.168.210]:9357 "EHLO outpost.ds9a.nl") by vger.kernel.org with ESMTP id ; Wed, 19 Mar 2003 07:34:36 -0500 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id B36C14508; Wed, 19 Mar 2003 13:45:33 +0100 (CET) Date: Wed, 19 Mar 2003 13:45:33 +0100 From: bert hubert To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [BUG] 2.5.65 ipv6 TCP checksum errors (capture attached) Message-ID: <20030319124533.GA14363@outpost.ds9a.nl> Mail-Followup-To: bert hubert , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="ew6BAiZeqk4r7MaW" Content-Disposition: inline User-Agent: Mutt/1.3.28i Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1970 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 3966 Lines: 71 --ew6BAiZeqk4r7MaW Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Interestingly, the initial ssh connection worked, the second one failed. Subsequent attempts fail too. This all over ipv6: hubert# tcpdump -r file -v -v 29.09 snapcount.33408 > hubert.ssh: S [tcp sum ok] 2737328594:2737328594(0) win 5760 (len 40, hlim 64) 29.09 hubert.ssh > snapcount.33408: S [tcp sum ok] 2399386333:2399386333(0) ack 2737328595 win 5712 (len 40, hlim 64) 29.09 snapcount.33408 > hubert.ssh: . 
[tcp sum ok] 1:1(0) ack 1 win 5760 (len 32, hlim 64) 29.10 hubert.ssh > snapcount.33408: P [bad tcp cksum 4f2!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) 29.30 hubert.ssh > snapcount.33408: P [bad tcp cksum 3bf1!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) 29.83 hubert.ssh > snapcount.33408: P [bad tcp cksum 23ef!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) 30.86 hubert.ssh > snapcount.33408: P [bad tcp cksum 23eb!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) Both hosts run 2.5.65. hubert.ipv6.ds9a.nl (publically routable, so you can try to ssh to me as long as I'm not asleep, the machine is next to my bed) is a pentium pro 200. Kernel was make mrpropered before compiling, virgin kernel. Capture attached. Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting --ew6BAiZeqk4r7MaW Content-Type: application/octet-stream Content-Disposition: attachment; filename=bad-csum Content-Transfer-Encoding: base64 1MOyoQIABAAAAAAAAAAAANwFAAABAAAAiWR4Pht3AQBeAAAAXgAAAAAIoRnw8ACgzMjyXIbd YAAAAAAoBkAgAQiIEDYAAAIIof/+GfDxIAEIiBA2AAACCKH//hnw8IKAABajKFHSAAAAAKAC FoACJAAAAgQFoAQCCAoAox+fAAAAAAEDAwCJZHg+wHcBAF4AAABeAAAAAKDMyPJcAAihGfDw ht1gAAAAACgGQCABCIgQNgAAAgih//4Z8PAgAQiIEDYAAAIIof/+GfDxABaCgI8Dut2jKFHT oBIWUOuhAAACBAWgBAIICgClzBoAox+fAQMDAIlkeD5KeAEAVgAAAFYAAAAACKEZ8PAAoMzI 8lyG3WAAAAAAIAZAIAEIiBA2AAACCKH//hnw8SABCIgQNgAAAgih//4Z8PCCgAAWoyhR048D ut6AEBaAGiIAAAEBCAoAox+gAKXMGolkeD5XigEAfgAAAH4AAAAAoMzI8lwACKEZ8PCG3WAA AAAASAZAIAEIiBA2AAACCKH//hnw8CABCIgQNgAAAgih//4Z8PEAFoKAjwO63qMoUdOAGBZQ aA0AAAEBCAoApcwfAKMfoFNTSC0xLjk5LU9wZW5TU0hfMy41cDEgRGViaWFuIDE6My41cDEt NQqJZHg+7pkEAH4AAAB+AAAAAKDMyPJcAAihGfDwht1gAAAAAEgGQCABCIgQNgAAAgih//4Z 8PAgAQiIEDYAAAIIof/+GfDxABaCgI8Dut6jKFHTgBgWUGgNAAABAQgKAKXM6ACjH6BTU0gt MS45OS1PcGVuU1NIXzMuNXAxIERlYmlhbiAxOjMuNXAxLTUKiWR4Pv/GDAB+AAAAfgAAAACg zMjyXAAIoRnw8IbdYAAAAABIBkAgAQiIEDYAAAIIof/+GfDwIAEIiBA2AAACCKH//hnw8QAW 
goCPA7reoyhR04AYFlBoDQAAAQEICgClzwAAox+gU1NILTEuOTktT3BlblNTSF8zLjVwMSBE ZWJpYW4gMTozLjVwMS01CopkeD4pJA0AfgAAAH4AAAAAoMzI8lwACKEZ8PCG3WAAAAAASAZA IAEIiBA2AAACCKH//hnw8CABCIgQNgAAAgih//4Z8PEAFoKAjwO63qMoUdOAGBZQaA0AAAEB CAoApdMAAKMfoFNTSC0xLjk5LU9wZW5TU0hfMy41cDEgRGViaWFuIDE6My41cDEtNQqMZHg+ 0/QJAH4AAAB+AAAAAKDMyPJcAAihGfDwht1gAAAAAEgGQCABCIgQNgAAAgih//4Z8PAgAQiI EDYAAAIIof/+GfDxABaCgI8Dut6jKFHTgBgWUGgNAAABAQgKAKXaAACjH6BTU0gtMS45OS1P cGVuU1NIXzMuNXAxIERlYmlhbiAxOjMuNXAxLTUKj2R4PkXxDgB+AAAAfgAAAACgzMjyXAAI oRnw8IbdYAAAAABIBkAgAQiIEDYAAAIIof/+GfDwIAEIiBA2AAACCKH//hnw8QAWgoCPA7re oyhR04AYFlBoDQAAAQEICgCl5wAAox+gU1NILTEuOTktT3BlblNTSF8zLjVwMSBEZWJpYW4g MTozLjVwMS01CpZkeD6mqAkAfgAAAH4AAAAAoMzI8lwACKEZ8PCG3WAAAAAASAZAIAEIiBA2 AAACCKH//hnw8CABCIgQNgAAAgih//4Z8PEAFoKAjwO63qMoUdOAGBZQaA0AAAEBCAoApgEA AKMfoFNTSC0xLjk5LU9wZW5TU0hfMy41cDEgRGViaWFuIDE6My41cDEtNQo= --ew6BAiZeqk4r7MaW-- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From archie@precisionio.com Wed Mar 19 14:09:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 14:09:31 -0800 (PST) Received: from mailman.precisionio.com (www.precisionio.com [65.192.41.225]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2JM8eq9022769 for ; Wed, 19 Mar 2003 14:09:22 -0800 Received: from bubba.precisionio.com (bubba.precisionio.com [172.16.0.223]) by mailman.precisionio.com (8.12.6/8.12.6) with ESMTP id h2JM8YWV043716; Wed, 19 Mar 2003 14:08:34 -0800 (PST) (envelope-from archie@precisionio.com) Received: from bubba.precisionio.com (localhost [127.0.0.1]) by bubba.precisionio.com (8.12.7/8.12.7) with ESMTP id h2JM8YoX037111; Wed, 19 Mar 2003 14:08:34 -0800 (PST) (envelope-from archie@bubba.precisionio.com) Received: (from archie@localhost) by bubba.precisionio.com (8.12.7/8.12.7/Submit) id h2JM8Yx1037110; Wed, 19 Mar 2003 
14:08:34 -0800 (PST) From: Archie Cobbs Message-Id: <200303192208.h2JM8Yx1037110@bubba.precisionio.com> Subject: [PATCH] sk_buff's allocated from private pools To: netdev@oss.sgi.com Date: Wed, 19 Mar 2003 14:08:34 -0800 (PST) CC: Archie Cobbs X-Mailer: ELM [version 2.4ME+ PL82 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1971 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: archie@precisionio.com Precedence: bulk X-list: netdev Content-Length: 5910 Lines: 202 Hello, I'm submitting this patch for inclusion in the Linux kernel if deemed generally useful. The purpose of this patch is to add a new function called alloc_skb_custom() (or whatever) that allows the data portion of an sk_buff to reside in any memory region, not just a region returned by kmalloc(). For example, if a networking device has a restriction on where receive buffers may reside, then the device driver can avoid copying every incoming packet if it is able to create an sk_buff that points to the receive buffer memory. Basically this amounts to adding a 'free_data' function pointer to the sk_buff structure. By default this points to kfree() but in general could point to anywhere. FYI FreeBSD has had equivalent functionality in its 'struct mbuf' for many years (I'm also a FreeBSD developer). Thanks for your review. 
Cheers, -Archie __________________________________________________________________________ Archie Cobbs * Precision I/O * http://www.precisionio.com Index: include/linux/skbuff.h =================================================================== RCS file: /home/cvs/linux-2.4.20/include/linux/skbuff.h,v retrieving revision 1.1.1.1 retrieving revision 1.2 diff -u -r1.1.1.1 -r1.2 --- include/linux/skbuff.h 3 Jan 2003 22:31:40 -0000 1.1.1.1 +++ include/linux/skbuff.h 15 Mar 2003 01:13:35 -0000 1.2 @@ -193,6 +193,7 @@ unsigned char *tail; /* Tail pointer */ unsigned char *end; /* End pointer */ + void (*free_data)(const void *); /* Free data buffer function */ void (*destructor)(struct sk_buff *); /* Destruct function */ #ifdef CONFIG_NETFILTER /* Can be used for communication between hooks. */ @@ -230,6 +231,8 @@ extern void __kfree_skb(struct sk_buff *skb); extern struct sk_buff * alloc_skb(unsigned int size, int priority); +extern struct sk_buff * alloc_skb_custom(unsigned int size, int priority, + void (*free_data)(const void *), u8 *data); extern void kfree_skbmem(struct sk_buff *skb); extern struct sk_buff * skb_clone(struct sk_buff *skb, int priority); extern struct sk_buff * skb_copy(const struct sk_buff *skb, int priority); Index: net/netsyms.c =================================================================== RCS file: /home/cvs/linux-2.4.20/net/netsyms.c,v retrieving revision 1.1.1.1 retrieving revision 1.2 diff -u -r1.1.1.1 -r1.2 --- net/netsyms.c 3 Jan 2003 22:31:42 -0000 1.1.1.1 +++ net/netsyms.c 15 Mar 2003 01:13:35 -0000 1.2 @@ -489,6 +489,7 @@ EXPORT_SYMBOL(eth_copy_and_sum); #endif EXPORT_SYMBOL(alloc_skb); +EXPORT_SYMBOL(alloc_skb_custom); EXPORT_SYMBOL(__kfree_skb); EXPORT_SYMBOL(skb_clone); EXPORT_SYMBOL(skb_copy); Index: net/core/skbuff.c =================================================================== RCS file: /home/cvs/linux-2.4.20/net/core/skbuff.c,v retrieving revision 1.1 retrieving revision 1.3 diff -u -r1.1 -r1.3 --- net/core/skbuff.c 
7 Jan 2003 00:35:25 -0000 1.1 +++ net/core/skbuff.c 18 Mar 2003 23:14:54 -0000 1.3 @@ -149,7 +149,7 @@ */ /** - * alloc_skb - allocate a network buffer + * alloc_skb - allocate a network buffer using kmalloc * @size: size to allocate * @gfp_mask: allocation mask * @@ -169,8 +169,46 @@ if (in_interrupt() && (gfp_mask & __GFP_WAIT)) { static int count = 0; if (++count < 5) { - printk(KERN_ERR "alloc_skb called nonatomically " - "from interrupt %p\n", NET_CALLER(size)); + printk(KERN_ERR "%s called nonatomically from " + "interrupt %p\n", "alloc_skb", NET_CALLER(size)); + BUG(); + } + gfp_mask &= ~__GFP_WAIT; + } + + /* Get the DATA. Size must match skb_add_mtu(). */ + size = SKB_DATA_ALIGN(size); + data = kmalloc(size + sizeof(struct skb_shared_info), gfp_mask); + if (data == NULL) + return NULL; + + /* Allocate the rest of the skb */ + if ((skb = alloc_skb_custom(size, gfp_mask, kfree, data)) == NULL) + kfree(data); + + /* Done */ + return skb; +} + +/** + * alloc_skb_custom - allocate a network buffer + * using the supplied data area + * + * This assumes that size is aligned via SKB_DATA_ALIGN(), and + * that 'data' points to size + sizeof(struct skb_shared_info) + * bytes. + */ + +struct sk_buff *alloc_skb_custom(unsigned int size, int gfp_mask, + void (*free_data)(const void *), u8 *data) +{ + struct sk_buff *skb; + + if (in_interrupt() && (gfp_mask & __GFP_WAIT)) { + static int count = 0; + if (++count < 5) { + printk(KERN_ERR "%s called nonatomically from " + "interrupt %p\n", "alloc_skb_custom", NET_CALLER(size)); BUG(); } gfp_mask &= ~__GFP_WAIT; @@ -184,11 +222,9 @@ goto nohead; } - /* Get the DATA. Size must match skb_add_mtu(). */ - size = SKB_DATA_ALIGN(size); - data = kmalloc(size + sizeof(struct skb_shared_info), gfp_mask); - if (data == NULL) - goto nodata; + /* Size must match skb_add_mtu(). 
*/ + if (size != SKB_DATA_ALIGN(size)) + BUG(); /* XXX: does not include slab overhead */ skb->truesize = size + sizeof(struct sk_buff); @@ -198,6 +234,7 @@ skb->data = data; skb->tail = data; skb->end = data + size; + skb->free_data = free_data; /* Set up other state */ skb->len = 0; @@ -210,8 +247,6 @@ skb_shinfo(skb)->frag_list = NULL; return skb; -nodata: - skb_head_to_pool(skb); nohead: return NULL; } @@ -285,7 +320,10 @@ if (skb_shinfo(skb)->frag_list) skb_drop_fraglist(skb); - kfree(skb->head); + if (skb->free_data == NULL) + BUG(); + (*skb->free_data)(skb->head); + skb->free_data = NULL; } } @@ -384,6 +422,7 @@ C(tail); C(end); n->destructor = NULL; + C(free_data); #ifdef CONFIG_NETFILTER C(nfmark); C(nfcache); @@ -520,6 +559,7 @@ skb->head = data; skb->end = data + size; + skb->free_data = kfree; /* Set up new pointers */ skb->h.raw += offset; @@ -647,6 +687,7 @@ skb->head = data; skb->end = data+size; + skb->free_data = kfree; skb->data += off; skb->tail += off; From davem@redhat.com Wed Mar 19 16:33:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 16:33:31 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K0XLq9025811 for ; Wed, 19 Mar 2003 16:33:22 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA12026; Wed, 19 Mar 2003 16:31:06 -0800 Date: Wed, 19 Mar 2003 16:31:05 -0800 (PST) Message-Id: <20030319.163105.44963500.davem@redhat.com> To: dlstevens@us.ibm.com Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] anycast support for IPv6, updated to 2.5.44 From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1972 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 444 Lines: 12 From: "David Stevens" Date: Mon, 28 Oct 2002 14:06:00 -0700 Below is a patch to add anycast support for IPv6. It's the same patch as I've posted previously, but updated with comments from Chris Hellwig and for kernel version 2.5.44. I'm going to apply this, with the small change that dev_getany() is renamed to dev_get_by_flags() which more accurately describes what the routine does. Thanks David. From yoshfuji@wide.ad.jp Wed Mar 19 19:01:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 19:01:32 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K31Jq9028566 for ; Wed, 19 Mar 2003 19:01:21 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2K31aUl005092; Thu, 20 Mar 2003 12:01:37 +0900 Date: Thu, 20 Mar 2003 12:01:36 +0900 (JST) Message-Id: <20030320.120136.108400165.yoshfuji@wide.ad.jp> To: davem@redhat.com Cc: dlstevens@us.ibm.com, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] anycast support for IPv6, updated to 2.5.44 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030319.163105.44963500.davem@redhat.com> References: <20030319.163105.44963500.davem@redhat.com> X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Mailer: Mew 
version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1973 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@wide.ad.jp Precedence: bulk X-list: netdev Content-Length: 1059 Lines: 25 In article <20030319.163105.44963500.davem@redhat.com> (at Wed, 19 Mar 2003 16:31:05 -0800 (PST)), "David S. Miller" says: > From: "David Stevens" > Date: Mon, 28 Oct 2002 14:06:00 -0700 > > Below is a patch to add anycast support for IPv6. It's the same patch as > I've posted previously, but updated with comments from Chris Hellwig and > for kernel version 2.5.44. > > I'm going to apply this, with the small change that dev_getany() is > renamed to dev_get_by_flags() which more accurately describes > what the routine does. Again: I don't like API at all. Anycast address management itself in that patch would be ok. However, JOIN/LEAVE is NOT useful and userland application will be incompatible with other implementation. (sigh...) I think linux likes unicast model (assign address like unicast address), too. And, we see __constant_{hton,ntoh}{l,h}() again... 
--
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From davem@redhat.com Wed Mar 19 19:25:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 19:25:45 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K3Pfq9029060 for ; Wed, 19 Mar 2003 19:25:42 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id TAA12333; Wed, 19 Mar 2003 19:23:32 -0800 Date: Wed, 19 Mar 2003 19:23:31 -0800 (PST) Message-Id: <20030319.192331.95884882.davem@redhat.com> To: yoshfuji@wide.ad.jp Cc: dlstevens@us.ibm.com, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] anycast support for IPv6, updated to 2.5.44 From: "David S. Miller" In-Reply-To: <20030320.120136.108400165.yoshfuji@wide.ad.jp> References: <20030319.163105.44963500.davem@redhat.com> <20030320.120136.108400165.yoshfuji@wide.ad.jp> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1974 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 948 Lines: 22

   From: YOSHIFUJI Hideaki / 吉藤英明
   Date: Thu, 20 Mar 2003 12:01:36 +0900 (JST)

   In article <20030319.163105.44963500.davem@redhat.com> (at Wed, 19 Mar 2003 16:31:05 -0800 (PST)), "David S. Miller" says:

   > I'm going to apply this, with the small change that dev_getany() is
   > renamed to dev_get_by_flags() which more accurately describes
   > what the routine does.

   Again: I don't like API at all.

   Anycast address management itself in that patch would be ok.
However, JOIN/LEAVE is NOT useful and userland application will be incompatible with other implementation. (sigh...) I think linux likes unicast model (assign address like unicast address), too. Please propose alternative API, or do you suggest not to export this facility to user at all? And, we see __constant_{hton,ntoh}{l,h}() again... I will fix this, thank you for mentioning this. From yoshfuji@wide.ad.jp Wed Mar 19 19:44:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 19:44:14 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K3i9q9029478 for ; Wed, 19 Mar 2003 19:44:11 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2K3iTUl005429; Thu, 20 Mar 2003 12:44:29 +0900 Date: Thu, 20 Mar 2003 12:44:28 +0900 (JST) Message-Id: <20030320.124428.95965257.yoshfuji@wide.ad.jp> To: davem@redhat.com Cc: dlstevens@us.ibm.com, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] anycast support for IPv6, updated to 2.5.44 From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20030319.192331.95884882.davem@redhat.com> References: <20030319.163105.44963500.davem@redhat.com> <20030320.120136.108400165.yoshfuji@wide.ad.jp> <20030319.192331.95884882.davem@redhat.com> X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1975 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@wide.ad.jp 
Precedence: bulk X-list: netdev Content-Length: 1131 Lines: 25

In article <20030319.192331.95884882.davem@redhat.com> (at Wed, 19 Mar 2003 19:23:31 -0800 (PST)), "David S. Miller" says:

> > I'm going to apply this, with the small change that dev_getany() is
> > renamed to dev_get_by_flags() which more accurately describes
> > what the routine does.
>
> Again: I don't like API at all.
>
> Anycast address management itself in that patch would be ok.
> However, JOIN/LEAVE is NOT useful and userland application will be
> incompatible with other implementation. (sigh...)
> I think linux likes unicast model (assign address like unicast address), too.
>
> Please propose alternative API, or do you suggest not
> to export this facility to user at all?

I would like to assign the address like a unicast address (using ioctl and rtnetlink (RTN_ANYCAST)).

We suggest not exporting this facility until the new API is finished. (Also, another API would be standardized; this is another reason why I am against exporting that API for now.)

--
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From davem@redhat.com Wed Mar 19 19:49:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 19:49:53 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K3nlq9029816 for ; Wed, 19 Mar 2003 19:49:49 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id TAA12413; Wed, 19 Mar 2003 19:47:35 -0800 Date: Wed, 19 Mar 2003 19:47:35 -0800 (PST) Message-Id: <20030319.194735.31799019.davem@redhat.com> To: yoshfuji@wide.ad.jp Cc: dlstevens@us.ibm.com, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] anycast support for IPv6, updated to 2.5.44 From: "David S.
Miller" In-Reply-To: <20030320.124428.95965257.yoshfuji@wide.ad.jp> References: <20030320.120136.108400165.yoshfuji@wide.ad.jp> <20030319.192331.95884882.davem@redhat.com> <20030320.124428.95965257.yoshfuji@wide.ad.jp> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1976 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 847 Lines: 19

   From: YOSHIFUJI Hideaki / 吉藤英明
   Date: Thu, 20 Mar 2003 12:44:28 +0900 (JST)

   In article <20030319.192331.95884882.davem@redhat.com> (at Wed, 19 Mar 2003 19:23:31 -0800 (PST)), "David S. Miller" says:

   > Please propose alternative API, or do you suggest not
   > to export this facility to user at all?

   I would like to assign the address like a unicast address (using ioctl and rtnetlink (RTN_ANYCAST)).

   We suggest not exporting this facility until the new API is finished. (Also, another API would be standardized; this is another reason why I am against exporting that API for now.)

I think anycast addresses are more like multicast than unicast. Do you agree about this?

But here is what really matters: does the advanced IPv6 socket API say anything about a user API for anycast?

From nalkunda@cse.msu.edu Wed Mar 19 20:57:25 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 20:57:29 -0800 (PST) Received: from sargasso.cse.msu.edu (sargasso.cse.msu.edu [35.9.20.10]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K4vNq9030458 for ; Wed, 19 Mar 2003 20:57:24 -0800 Received: from elans-pc.cse.msu.edu (elans.cse.msu.edu [35.9.43.164]) by sargasso.cse.msu.edu (8.12.8/8.12.8) with ESMTP id h2K4vHhA003182; Wed, 19 Mar 2003 23:57:17 -0500 (EST) Content-Type: text/plain; charset="us-ascii" From: N N Ashok To: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Casting (struct rtable*) to (struct dst_entry*) Date: Wed, 19 Mar 2003 23:55:02 -0500 User-Agent: KMail/1.4.3 MIME-Version: 1.0 Message-Id: <200303192355.02509.nalkunda@cse.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h2K4vNq9030458 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1977 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@cse.msu.edu Precedence: bulk X-list: netdev Content-Length: 881 Lines: 25

Hi,

I have been looking at the networking code of Linux for my Masters thesis. I observed the following:

In ip_route_input(), if a route is found in the cache, skb->dst is set up with the route found by casting the rtable entry to dst_entry:

	skb->dst = (struct dst_entry*)rth;

Later in ip_route_input(), skb->dst->input() is called:

	return skb->dst->input(skb);

In ip_forward(), skb->dst is again cast to rtable:

	rt = (struct rtable*)skb->dst;

I am unable to understand how an rtable structure cast to dst_entry will give a correct pointer to the input() function. I looked at the fields in rtable and dst_entry, and the fields in the two structures cannot be lined up (the fourth field in rtable is not the same type as the fourth field in dst_entry).

Can anybody help me understand this casting of rtable to dst_entry and then back to rtable?

Thanks,
Ashok

From davem@redhat.com Wed Mar 19 21:00:47 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 21:00:54 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K50kq9030993 for ; Wed, 19 Mar 2003 21:00:46 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id UAA12505; Wed, 19 Mar 2003 20:59:13 -0800 Date: Wed, 19 Mar 2003 20:59:12 -0800 (PST) Message-Id: <20030319.205912.131928327.davem@redhat.com> To: nalkunda@cse.msu.edu Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Casting (struct rtable*) to (struct dst_entry*) From: "David S. Miller" In-Reply-To: <200303192355.02509.nalkunda@cse.msu.edu> References: <200303192355.02509.nalkunda@cse.msu.edu> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1978 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 463 Lines: 10

   From: N N Ashok
   Date: Wed, 19 Mar 2003 23:55:02 -0500

   I am unable to understand how an rtable structure cast to dst_entry will give a correct pointer to the input() function. I looked at the fields in rtable and dst_entry, and the fields in the two structures cannot be lined up (the fourth field in rtable is not the same type as the fourth field in dst_entry).

"struct rtable" starts with a "struct dst_entry".

From nalkunda@cse.msu.edu Wed Mar 19 21:30:04 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 21:30:16 -0800 (PST) Received: from sargasso.cse.msu.edu (sargasso.cse.msu.edu [35.9.20.10]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K5U3q9031959 for ; Wed, 19 Mar 2003 21:30:04 -0800 Received: from elans-pc.cse.msu.edu (elans.cse.msu.edu [35.9.43.164]) by sargasso.cse.msu.edu (8.12.8/8.12.8) with ESMTP id h2K5TvhA006821; Thu, 20 Mar 2003 00:29:57 -0500 (EST) Content-Type: text/plain; charset="iso-8859-1" From: N N Ashok To: "David S. Miller" Subject: Re: Casting (struct rtable*) to (struct dst_entry*) Date: Thu, 20 Mar 2003 00:27:41 -0500 User-Agent: KMail/1.4.3 Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org References: <200303192355.02509.nalkunda@cse.msu.edu> <20030319.205912.131928327.davem@redhat.com> In-Reply-To: <20030319.205912.131928327.davem@redhat.com> MIME-Version: 1.0 Message-Id: <200303200027.41923.nalkunda@cse.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h2K5U3q9031959 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1979 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@cse.msu.edu Precedence: bulk X-list: netdev Content-Length: 1417 Lines: 32

On Wednesday 19 March 2003 23:59, David S. Miller wrote:
> From: N N Ashok
> Date: Wed, 19 Mar 2003 23:55:02 -0500
>
> I am unable to understand how an rtable structure cast to dst_entry
> will give a correct pointer to the input() function. I looked at the fields
> in rtable and dst_entry, and the fields in the two structures cannot be lined
> up (the fourth field in rtable is not the same type as the fourth field in
> dst_entry).
>
> "struct rtable" starts with a "struct dst_entry".

Thanks David. I did see that.
However, I could not understand how "struct rtable" can be cast to "struct dst_entry" and then back again, all the while accessing fields of both structures.

When the (struct rtable *)rth is filled in ip_route_input(), the variables accessed are those of rtable. Then rth is cast to (struct dst_entry *) and assigned to skb->dst (which is of type struct dst_entry *). After this, in ip_rcv_finish(), a field of dst_entry is accessed, as in skb->dst->input(). I am unable to understand how data filled in as rtable fields will be valid when accessed as dst_entry fields. Later, in ip_forward() (for a packet to be forwarded), skb->dst is cast to (struct rtable *) and its fields are accessed.

A correction to the previous post: skb->dst->input() is invoked in ip_rcv_finish(), not in ip_route_input() as mentioned in that post.

Thanks,
Ashok

From davem@redhat.com Wed Mar 19 21:36:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 21:36:11 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K5ZPq9032346 for ; Wed, 19 Mar 2003 21:36:06 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA12634; Wed, 19 Mar 2003 21:33:52 -0800 Date: Wed, 19 Mar 2003 21:33:52 -0800 (PST) Message-Id: <20030319.213352.50358237.davem@redhat.com> To: nalkunda@cse.msu.edu Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Casting (struct rtable*) to (struct dst_entry*) From: "David S. Miller" In-Reply-To: <200303200027.41923.nalkunda@cse.msu.edu> References: <200303192355.02509.nalkunda@cse.msu.edu> <20030319.205912.131928327.davem@redhat.com> <200303200027.41923.nalkunda@cse.msu.edu> X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1980 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 637 Lines: 24

   From: N N Ashok
   Date: Thu, 20 Mar 2003 00:27:41 -0500

   I did see that. However, I could not understand how "struct rtable" can be cast to "struct dst_entry" and then back again, all the while accessing fields of both structures.

You miss the point that they are the same structure. It is allocated with the size of "struct rtable", but it may be cast back and forth between rtable and dst_entry as desired.

void foo(void)
{
	struct rtable rt;
	struct dst_entry *dst;

	rt.u.dst.bar = 1;

	dst = (struct dst_entry *) &rt;
	ASSERT(dst->bar == 1);

	dst = &rt.u.dst;
	ASSERT(dst->bar == 1);
}

From nalkunda@cse.msu.edu Wed Mar 19 22:05:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 22:05:43 -0800 (PST) Received: from sargasso.cse.msu.edu (sargasso.cse.msu.edu [35.9.20.10]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K650q9000310 for ; Wed, 19 Mar 2003 22:05:40 -0800 Received: from elans-pc.cse.msu.edu (elans.cse.msu.edu [35.9.43.164]) by sargasso.cse.msu.edu (8.12.8/8.12.8) with ESMTP id h2K64shA010265; Thu, 20 Mar 2003 01:04:54 -0500 (EST) Content-Type: text/plain; charset="iso-8859-1" From: N N Ashok To: "David S.
Miller" Subject: Re: Casting (struct rtable*) to (struct dst_entry*) Date: Thu, 20 Mar 2003 01:02:38 -0500 User-Agent: KMail/1.4.3 Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org References: <200303192355.02509.nalkunda@cse.msu.edu> <200303200027.41923.nalkunda@cse.msu.edu> <20030319.213352.50358237.davem@redhat.com> In-Reply-To: <20030319.213352.50358237.davem@redhat.com> MIME-Version: 1.0 Message-Id: <200303200102.38813.nalkunda@cse.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h2K650q9000310 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1981 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@cse.msu.edu Precedence: bulk X-list: netdev Content-Length: 1221 Lines: 38

On Thursday 20 March 2003 00:33, David S. Miller wrote:
> From: N N Ashok
> Date: Thu, 20 Mar 2003 00:27:41 -0500
>
> I did see that. However, I could not understand how "struct rtable"
> can be cast to "struct dst_entry" and then back again, all the while
> accessing fields of both structures.
>
> You miss the point that they are the same structure. It is allocated
> with the size of "struct rtable", but it may be cast back and forth
> between rtable and dst_entry as desired.
>
> void foo(void)
> {
> 	struct rtable rt;
> 	struct dst_entry *dst;
>
> 	rt.u.dst.bar = 1;
>
> 	dst = (struct dst_entry *) &rt;
> 	ASSERT(dst->bar == 1);
>
> 	dst = &rt.u.dst;
> 	ASSERT(dst->bar == 1);
> }

I think I finally understand the whole setup. Please correct me if I'm wrong.

"struct rtable" has as its first field "u", which is a union of "struct dst_entry" and "struct rtable *". Thus, when we cast rtable to dst_entry, we are accessing rtable.u.dst itself and not any other part of rtable. Since the data was originally allocated with the size of "struct rtable", when we cast "dst_entry" back to "struct rtable" we can access all the fields of "struct rtable".

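The layout discussed in this thread can be tried out in user space. What follows is a minimal sketch, not kernel code: the two structures are cut-down stand-ins that keep only the shape described in the thread (a union "u" as the first member of "struct rtable", with a "struct dst_entry" inside it). The names demo_input(), rt_to_dst(), dst_to_rt() and cast_roundtrip_ok() are hypothetical helpers invented for the illustration, and the input() hook takes an int rather than a struct sk_buff pointer, which is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the kernel's struct dst_entry (assumption:
 * only two fields are kept, and input() takes an int, not an skb). */
struct dst_entry {
	int refcnt;
	int (*input)(int pkt);
};

/* Cut-down stand-in for struct rtable: the union holding the
 * dst_entry is the first member, so it lives at offset 0. */
struct rtable {
	union {
		struct dst_entry dst;
		struct rtable *rt_next;
	} u;
	unsigned int rt_gateway;	/* field that exists only in rtable */
};

/* Hypothetical input hook, used only to show the call goes through. */
static int demo_input(int pkt)
{
	return pkt + 1;
}

/* rtable -> dst_entry: same address, narrower view. */
struct dst_entry *rt_to_dst(struct rtable *rt)
{
	return (struct dst_entry *)rt;
}

/* dst_entry -> rtable: valid only if dst really sits inside an rtable. */
struct rtable *dst_to_rt(struct dst_entry *dst)
{
	return (struct rtable *)dst;
}

/* Returns 1 when every step of the round trip behaves as described. */
int cast_roundtrip_ok(void)
{
	struct rtable rt;
	struct dst_entry *dst;

	/* The whole trick depends on the union being the first member. */
	assert(offsetof(struct rtable, u) == 0);

	memset(&rt, 0, sizeof(rt));
	rt.u.dst.input = demo_input;	/* filled in through the rtable view */
	rt.rt_gateway = 0x0a000001;

	dst = rt_to_dst(&rt);
	if (dst != &rt.u.dst)		/* both expressions name one address */
		return 0;
	if (dst->input(41) != 42)	/* read back through the dst_entry view */
		return 0;

	/* Casting back recovers the fields that only rtable has. */
	return dst_to_rt(dst)->rt_gateway == 0x0a000001;
}
```

Calling cast_roundtrip_ok() from a test program should return 1. The cast back is only safe because the object was created as a full "struct rtable"; a bare "struct dst_entry" allocated on its own must not be cast this way.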
Thanks a lot for the clarification. Ashok From dlstevens@us.ibm.com Wed Mar 19 23:34:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 19 Mar 2003 23:35:00 -0800 (PST) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K7Ypq9001312 for ; Wed, 19 Mar 2003 23:34:53 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e32.co.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2K7XqTr046592; Thu, 20 Mar 2003 02:33:52 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2K7YiZt223914; Thu, 20 Mar 2003 00:34:44 -0700 Importance: Normal Sensitivity: Subject: Re: [PATCH] anycast support for IPv6, updated to 2.5.44 To: "David S. Miller" Cc: yoshfuji@wide.ad.jp, kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.4a July 24, 2000 Message-ID: From: David Stevens Date: Thu, 20 Mar 2003 00:34:41 -0700 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0 [IBM]|December 16, 2002) at 03/20/2003 00:34:43 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1982 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dlstevens@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1009 Lines: 23 Yoshifuji, I created the multicast-like API because, aside from the in-kernel use, there was no way to use anycasting otherwise, and I believe for at least the high-availability case, it doesn't make any sense to treat it like a unicast address. An exited DNS server program with the server machine still up will in fact deny service to clients that might otherwise find a working server if the "permanent" address model were not there. 
With a multicast-like interface at least available, programs have the choice of tying the anycast address to whether or not the service that needs it is running. That said, there's no reason why you can't have both, and that's straightforward with the code (but not implemented). I think it's too early to be concerned with compatibility since there is no alternative non-permanent anycast address API. If Linux has an API to do something that can't be done at all on other systems, there clearly isn't a portability issue. +-DLS From nalkunda@cse.msu.edu Thu Mar 20 00:34:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 00:34:51 -0800 (PST) Received: from sargasso.cse.msu.edu (sargasso.cse.msu.edu [35.9.20.10]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K8Yiq9006643 for ; Thu, 20 Mar 2003 00:34:45 -0800 Received: from elans-pc.cse.msu.edu (elans.cse.msu.edu [35.9.43.164]) by sargasso.cse.msu.edu (8.12.8/8.12.8) with ESMTP id h2K8YchA024595; Thu, 20 Mar 2003 03:34:38 -0500 (EST) Content-Type: text/plain; charset="us-ascii" From: N N Ashok To: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Keeping track of an interface Date: Thu, 20 Mar 2003 03:32:22 -0500 User-Agent: KMail/1.4.3 MIME-Version: 1.0 Message-Id: <200303200332.22747.nalkunda@cse.msu.edu> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h2K8Yiq9006643 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1983 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nalkunda@cse.msu.edu Precedence: bulk X-list: netdev Content-Length: 1129 Lines: 23 Hi, I have a situation where I want to record which interface (say eth2) was used for a packet (say the current packet). Later I want to be able to send other packets over the interface which I recorded if it is up and able to send packets. 
What would be the best way to keep track of an interface for the above scenario? Could we use the "struct net_device" pointer to the interface? If we kept track of the "struct net_device" pointer, and later that interface flapped, the pointer would no longer be valid (I assume so). Then how do we detect that? One way I thought of was to keep track of the "oif" (in "struct fib_nh") for the particular device. Is it guaranteed that the same device will always have the same "oif" even if it flapped? If so, it would be a simple matter to record the "oif" number of the device and later send the packets on that interface after checking that the interface is up and able to send packets. The broader aim is to route packets of a connection over the same interface, like routing all packets of the same TCP session over the same interface. Thanks, Ashok From Robert.Olsson@data.slu.se Thu Mar 20 01:33:31 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 01:33:36 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2K9XSq9008073 for ; Thu, 20 Mar 2003 01:33:30 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id KAA25659; Thu, 20 Mar 2003 10:33:21 +0100 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15993.35552.587283.291895@robur.slu.se> Date: Thu, 20 Mar 2003 10:33:20 +0100 To: Archie Cobbs Cc: netdev@oss.sgi.com Subject: [PATCH] sk_buff's allocated from private pools In-Reply-To: <200303192208.h2JM8Yx1037110@bubba.precisionio.com> References: <200303192208.h2JM8Yx1037110@bubba.precisionio.com> X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1985 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: 
netdev Content-Length: 1496 Lines: 38 Archie Cobbs writes: > Hello, > > I'm submitting this patch for inclusion in the Linux kernel if deemed > generally useful. > > The purpose of this patch is to add a new function called alloc_skb_custom() > (or whatever) that allows the data portion of an sk_buff to reside in any > memory region, not just a region returned by kmalloc(). For example, if a > networking device has a restriction on where receive buffers may reside, > then the device driver can avoid copying every incoming packet if it is > able to create an sk_buff that points to the receive buffer memory. > > Basically this amounts to adding a 'free_data' function pointer to the > sk_buff structure. By default this points to kfree() but in general could > point to anywhere. FYI. The skb recycling patches I play with use the same callback and have an implementation for private buffers that syncs with outstanding callback-marked skbs etc. ftp://robur.slu.se/pub/Linux/net-development/skb_recycling/recycle19.pat ftp://robur.slu.se/pub/Linux/net-development/skb_recycling/e1000-RC-030217.pat Also, for SMP it marks in the skb header on which CPU skb_headerinit was done, so the callback has a chance to re-route the skb to the origin CPU to minimize cache bouncing in case of recycling. Also, skb_headerinit is moved to be the first operation in the life of an skb, not the last. The current implementation uses only kmalloc for the data part, so your alloc_skb_custom adds some new value. Cheers.
--ro From hshmulik@intel.com Thu Mar 20 07:15:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:15:48 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFFgq9024841 for ; Thu, 20 Mar 2003 07:15:43 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KF96605132 for ; Thu, 20 Mar 2003 15:09:06 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFH5c07934 for ; Thu, 20 Mar 2003 15:17:05 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007155221563 ; Thu, 20 Mar 2003 07:15:53 -0800 Date: Thu, 20 Mar 2003 17:15:36 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (2/8) Add 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1990 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 5930 Lines: 192 This patch complements the latest drop of bonding from source-forge (2.4.20-20030317) by incorporating the changes to bond_release_all() too. 
It also fixes a hang when releasing a slave while outgoing traffic is running, that looks like a deadlock between the BR_NETPROTO_LOCK, dev->xmit_lock and the bond lock (happens on quad processor machines, but KDB back trace wasn't clear enough). This patch is against bonding 2.4.20-20030317. diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding.c 2003-03-18 17:03:24.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c 2003-03-18 17:03:24.000000000 +0200 @@ -286,6 +286,15 @@ * checking slave and slave->dev (which only worked by accident). * - Misc code cleanup: get arp_send() prototype from header file, * add max_bonds to bonding.txt. + * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Make sure only bond_attach_slave() and bond_detach_slave() can + * manipulate the slave list, including slave_cnt, even when in + * bond_release_all(). + * - Fixed hang in bond_release() while traffic is running. + * netdev_set_master() must not be called from within the bond lock. 
+ * */ #include @@ -326,8 +335,8 @@ #include #include -#define DRV_VERSION "2.4.20-20030207" -#define DRV_RELDATE "February 7, 2003" +#define DRV_VERSION "2.4.20-20030317" +#define DRV_RELDATE "March 17, 2003" #define DRV_NAME "bonding" #define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" @@ -1469,16 +1478,14 @@ static int bond_release(struct net_devic bond = (struct bonding *) master->priv; - write_lock_irqsave(&bond->lock, flags); - /* master already enslaved, or slave not enslaved, or no slave for this master */ if ((master->flags & IFF_SLAVE) || !(slave->flags & IFF_SLAVE)) { printk (KERN_DEBUG "%s: cannot release %s.\n", master->name, slave->name); - write_unlock_irqrestore(&bond->lock, flags); return -EINVAL; } + write_lock_irqsave(&bond->lock, flags); bond->current_arp_slave = NULL; our_slave = (slave_t *)bond; old_current = bond->current_slave; @@ -1497,38 +1504,7 @@ static int bond_release(struct net_devic } else { printk(".\n"); } - - /* release the slave from its bond */ - - if (multicast_mode == BOND_MULTICAST_ALL) { - /* flush master's mc_list from slave */ - bond_mc_list_flush (slave, master); - - /* unset promiscuity level from slave */ - if (master->flags & IFF_PROMISC) - dev_set_promiscuity(slave, -1); - - /* unset allmulti level from slave */ - if (master->flags & IFF_ALLMULTI) - dev_set_allmulti(slave, -1); - } - - netdev_set_master(slave, NULL); - - /* only restore its RUNNING flag if monitoring set it down */ - if (slave->flags & IFF_UP) { - slave->flags |= IFF_RUNNING; - } - - if (slave->flags & IFF_NOARP || - bond->current_slave != NULL) { - dev_close(slave); - our_slave->original_flags &= ~IFF_UP; - } - - bond_restore_slave_flags(our_slave); - kfree(our_slave); - + if (bond->current_slave == NULL) { printk(KERN_INFO "%s: now running without any active interface !\n", @@ -1539,16 +1515,51 @@ static int bond_release(struct net_devic bond->primary_slave = NULL; } - write_unlock_irqrestore(&bond->lock, flags); - return 0; /* deletion OK */ 
+ break; } - } - /* if we get here, it's because the device was not found */ + } write_unlock_irqrestore(&bond->lock, flags); + + if (our_slave == (slave_t *)bond) { + /* if we get here, it's because the device was not found */ + printk (KERN_INFO "%s: %s not enslaved\n", master->name, slave->name); + return -EINVAL; + } + + /* undo settings and restore original values */ + + if (multicast_mode == BOND_MULTICAST_ALL) { + /* flush master's mc_list from slave */ + bond_mc_list_flush (slave, master); - printk (KERN_INFO "%s: %s not enslaved\n", master->name, slave->name); - return -EINVAL; + /* unset promiscuity level from slave */ + if (master->flags & IFF_PROMISC) + dev_set_promiscuity(slave, -1); + + /* unset allmulti level from slave */ + if (master->flags & IFF_ALLMULTI) + dev_set_allmulti(slave, -1); + } + + netdev_set_master(slave, NULL); + + /* only restore its RUNNING flag if monitoring set it down */ + if (slave->flags & IFF_UP) { + slave->flags |= IFF_RUNNING; + } + + if (slave->flags & IFF_NOARP || + bond->current_slave != NULL) { + dev_close(slave); + our_slave->original_flags &= ~IFF_UP; + } + + bond_restore_slave_flags(our_slave); + + kfree(our_slave); + + return 0; /* deletion OK */ } /* @@ -1571,10 +1582,12 @@ static int bond_release_all(struct net_d bond = (struct bonding *) master->priv; bond->current_arp_slave = NULL; + bond->current_slave = NULL; + bond->primary_slave = NULL; while ((our_slave = bond->prev) != (slave_t *)bond) { slave_dev = our_slave->dev; - bond->prev = our_slave->prev; + bond_detach_slave(bond, our_slave); if (multicast_mode == BOND_MULTICAST_ALL || (multicast_mode == BOND_MULTICAST_ACTIVE @@ -1604,10 +1617,6 @@ static int bond_release_all(struct net_d dev_close(slave_dev); } - bond->current_slave = NULL; - bond->next = (slave_t *)bond; - bond->slave_cnt = 0; - bond->primary_slave = NULL; printk (KERN_INFO "%s: released all slaves\n", master->name); return 0; -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access 
Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:15:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:15:17 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFFEq9024645 for ; Thu, 20 Mar 2003 07:15:14 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by hermes.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFBif07915 for ; Thu, 20 Mar 2003 15:11:45 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFGac07600 for ; Thu, 20 Mar 2003 15:16:36 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007152306522 ; Thu, 20 Mar 2003 07:15:25 -0800 Date: Thu, 20 Mar 2003 17:15:07 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (1/8) Adding 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1989 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 3792 Lines: 100 This patch adds support for point to point protocols (e.g. 802.3ad) over bonding that need to know the physical device the skb came on. 
It saves the real device in a new field in skbuff before overwriting it with the virtual interface device in skb_bond() and __vlan_hwaccel_rx(). This patch is against 2.4.20 kernel and gives compatibility for anyone who wants to test the latest release of bonding from source-forge (2.4.20-20030317). diff -Nuarp linux-2.4.20-orig/include/linux/if_vlan.h linux-2.4.20-devel/include/linux/if_vlan.h --- linux-2.4.20-orig/include/linux/if_vlan.h 2002-11-29 01:53:15.000000000 +0200 +++ linux-2.4.20-devel/include/linux/if_vlan.h 2003-03-04 12:13:23.000000000 +0200 @@ -148,6 +148,9 @@ static inline int __vlan_hwaccel_rx(stru { struct net_device_stats *stats; +#ifdef BOND_POINT_TO_POINT_PROT + skb->real_dev = skb->dev; +#endif //BOND_POINT_TO_POINT_PROT skb->dev = grp->vlan_devices[vlan_tag & VLAN_VID_MASK]; if (skb->dev == NULL) { kfree_skb(skb); diff -Nuarp linux-2.4.20-orig/include/linux/skbuff.h linux-2.4.20-devel/include/linux/skbuff.h --- linux-2.4.20-orig/include/linux/skbuff.h 2002-08-03 03:39:46.000000000 +0300 +++ linux-2.4.20-devel/include/linux/skbuff.h 2003-03-04 11:59:29.000000000 +0200 @@ -135,6 +135,11 @@ struct sk_buff { struct sock *sk; /* Socket we are owned by */ struct timeval stamp; /* Time we arrived */ struct net_device *dev; /* Device we arrived on/are leaving by */ +#define BOND_POINT_TO_POINT_PROT + struct net_device *real_dev; /* For support of point to point protocols + (e.g. 802.3ad) over bonding, we must save the + physical device that got the packet before + replacing skb->dev with the virtual device. 
*/ /* Transport layer header */ union diff -Nuarp linux-2.4.20-orig/net/core/dev.c linux-2.4.20-devel/net/core/dev.c --- linux-2.4.20-orig/net/core/dev.c 2002-11-29 01:53:15.000000000 +0200 +++ linux-2.4.20-devel/net/core/dev.c 2003-03-03 19:48:15.000000000 +0200 @@ -1328,8 +1328,12 @@ static __inline__ void skb_bond(struct s { struct net_device *dev = skb->dev; - if (dev->master) - skb->dev = dev->master; + if (dev->master) { +#ifdef BOND_POINT_TO_POINT_PROT + skb->real_dev = skb->dev; +#endif //BOND_POINT_TO_POINT_PROT + skb->dev = dev->master; + } } static void net_tx_action(struct softirq_action *h) diff -Nuarp linux-2.4.20-orig/net/core/skbuff.c linux-2.4.20-devel/net/core/skbuff.c --- linux-2.4.20-orig/net/core/skbuff.c 2002-08-03 03:39:46.000000000 +0300 +++ linux-2.4.20-devel/net/core/skbuff.c 2003-03-03 19:51:39.000000000 +0200 @@ -231,6 +231,9 @@ static inline void skb_headerinit(void * skb->sk = NULL; skb->stamp.tv_sec=0; /* No idea about time */ skb->dev = NULL; +#ifdef BOND_POINT_TO_POINT_PROT + skb->real_dev = NULL; +#endif //BOND_POINT_TO_POINT_PROT skb->dst = NULL; memset(skb->cb, 0, sizeof(skb->cb)); skb->pkt_type = PACKET_HOST; /* Default type */ @@ -362,6 +365,9 @@ struct sk_buff *skb_clone(struct sk_buff n->sk = NULL; C(stamp); C(dev); +#ifdef BOND_POINT_TO_POINT_PROT + C(real_dev); +#endif //BOND_POINT_TO_POINT_PROT C(h); C(nh); C(mac); @@ -417,6 +423,9 @@ static void copy_skb_header(struct sk_bu new->list=NULL; new->sk=NULL; new->dev=old->dev; +#ifdef BOND_POINT_TO_POINT_PROT + new->real_dev=old->real_dev; +#endif //BOND_POINT_TO_POINT_PROT new->priority=old->priority; new->protocol=old->protocol; new->dst=dst_clone(old->dst); -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. 
| | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:14:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:14:38 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFEWq9024559 for ; Thu, 20 Mar 2003 07:14:33 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by hermes.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFAkf07558 for ; Thu, 20 Mar 2003 15:10:55 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFFcc06713 for ; Thu, 20 Mar 2003 15:15:38 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007142513089 ; Thu, 20 Mar 2003 07:14:27 -0800 Date: Thu, 20 Mar 2003 17:14:09 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (0/8) Adding 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1988 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 3746 Lines: 102 This patch adds support for point to point protocols (e.g. 802.3ad) over bonding that need to know the physical device the skb came on. It saves the real device in a new field in skbuff before overwriting it with the virtual interface device in skb_bond() and __vlan_hwaccel_rx(). This patch is against 2.4.21-pre5 kernel. 
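The idea described above (remember the physical device, then replace skb->dev with the virtual one) can be sketched in user space as follows. This is only an illustration under stated assumptions: the two structs are cut-down stand-ins, not the kernel definitions, and skb_bond_sketch() is a hypothetical name that loosely mirrors the patched skb_bond().

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel structures; only the
 * fields mentioned in the patch description are modelled. */
struct net_device {
    const char *name;
    struct net_device *master;    /* bonding master, or NULL if not enslaved */
};

struct sk_buff {
    struct net_device *dev;       /* device the upper layers will see */
    struct net_device *real_dev;  /* physical device that got the packet */
};

/* Save the physical device first, then hand the packet to the master.
 * If the device has no master, nothing is changed. */
static void skb_bond_sketch(struct sk_buff *skb)
{
    struct net_device *dev = skb->dev;

    if (dev->master) {
        skb->real_dev = skb->dev;
        skb->dev = dev->master;
    }
}
```

A point to point protocol handler could then look at real_dev to learn which slave received the frame, while the rest of the stack keeps seeing the master in dev.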
diff -Nuarp linux-2.4.21-pre5-orig/include/linux/if_vlan.h linux-2.4.21-pre5-new/include/linux/if_vlan.h --- linux-2.4.21-pre5-orig/include/linux/if_vlan.h 2002-11-29 01:53:15.000000000 +0200 +++ linux-2.4.21-pre5-new/include/linux/if_vlan.h 2003-03-04 14:01:56.000000000 +0200 @@ -148,6 +148,9 @@ static inline int __vlan_hwaccel_rx(stru { struct net_device_stats *stats; +#ifdef BOND_POINT_TO_POINT_PROT + skb->real_dev = skb->dev; +#endif //BOND_POINT_TO_POINT_PROT skb->dev = grp->vlan_devices[vlan_tag & VLAN_VID_MASK]; if (skb->dev == NULL) { kfree_skb(skb); diff -Nuarp linux-2.4.21-pre5-orig/include/linux/skbuff.h linux-2.4.21-pre5-new/include/linux/skbuff.h --- linux-2.4.21-pre5-orig/include/linux/skbuff.h 2003-03-04 13:43:27.000000000 +0200 +++ linux-2.4.21-pre5-new/include/linux/skbuff.h 2003-03-04 14:13:25.000000000 +0200 @@ -135,6 +135,11 @@ struct sk_buff { struct sock *sk; /* Socket we are owned by */ struct timeval stamp; /* Time we arrived */ struct net_device *dev; /* Device we arrived on/are leaving by */ +#define BOND_POINT_TO_POINT_PROT + struct net_device *real_dev; /* For support of point to point protocols + (e.g. 802.3ad) over bonding, we must save the + physical device that got the packet before + replacing skb->dev with the virtual device. 
*/ /* Transport layer header */ union diff -Nuarp linux-2.4.21-pre5-orig/net/core/dev.c linux-2.4.21-pre5-new/net/core/dev.c --- linux-2.4.21-pre5-orig/net/core/dev.c 2003-03-04 13:43:28.000000000 +0200 +++ linux-2.4.21-pre5-new/net/core/dev.c 2003-03-04 14:14:56.000000000 +0200 @@ -1328,8 +1328,12 @@ static __inline__ void skb_bond(struct s { struct net_device *dev = skb->dev; - if (dev->master) - skb->dev = dev->master; + if (dev->master) { +#ifdef BOND_POINT_TO_POINT_PROT + skb->real_dev = skb->dev; +#endif //BOND_POINT_TO_POINT_PROT + skb->dev = dev->master; + } } static void net_tx_action(struct softirq_action *h) diff -Nuarp linux-2.4.21-pre5-orig/net/core/skbuff.c linux-2.4.21-pre5-new/net/core/skbuff.c --- linux-2.4.21-pre5-orig/net/core/skbuff.c 2003-03-04 13:43:28.000000000 +0200 +++ linux-2.4.21-pre5-new/net/core/skbuff.c 2003-03-04 14:17:44.000000000 +0200 @@ -231,6 +231,9 @@ static inline void skb_headerinit(void * skb->sk = NULL; skb->stamp.tv_sec=0; /* No idea about time */ skb->dev = NULL; +#ifdef BOND_POINT_TO_POINT_PROT + skb->real_dev = NULL; +#endif //BOND_POINT_TO_POINT_PROT skb->dst = NULL; memset(skb->cb, 0, sizeof(skb->cb)); skb->pkt_type = PACKET_HOST; /* Default type */ @@ -362,6 +365,9 @@ struct sk_buff *skb_clone(struct sk_buff n->sk = NULL; C(stamp); C(dev); +#ifdef BOND_POINT_TO_POINT_PROT + C(real_dev); +#endif //BOND_POINT_TO_POINT_PROT C(h); C(nh); C(mac); @@ -417,6 +423,9 @@ static void copy_skb_header(struct sk_bu new->list=NULL; new->sk=NULL; new->dev=old->dev; +#ifdef BOND_POINT_TO_POINT_PROT + new->real_dev=old->real_dev; +#endif //BOND_POINT_TO_POINT_PROT new->priority=old->priority; new->protocol=old->protocol; new->dst=dst_clone(old->dst); -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. 
| | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:16:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:16:21 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFGEq9025187 for ; Thu, 20 Mar 2003 07:16:14 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KF9b605562 for ; Thu, 20 Mar 2003 15:09:37 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFHUc08232 for ; Thu, 20 Mar 2003 15:17:30 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007161716904 ; Thu, 20 Mar 2003 07:16:19 -0800 Date: Thu, 20 Mar 2003 17:16:01 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (3/8) Add 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1991 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 3378 Lines: 103 This patch fixes a hang when enslaving a new slave while incoming traffic is running, that looks like a deadlock between the BR_NETPROTO_LOCK, dev->xmit_lock and the bond lock (happens on quad processor machines, but KDB back trace wasn't clear enough). This patch is against bonding 2.4.20-20030317. 
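The ordering this fix relies on (call netdev_set_master() before taking the bond lock, and hold the lock only around the slave list update) can be modelled in user space roughly as below. This is a sketch under assumptions, not the driver code: every name here is hypothetical, the "lock" is just a flag, and set_master_model() merely counts how often it is called with that flag set.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct bond_model {
    int lock_held;     /* 1 while the modelled bond lock is taken */
    int slave_cnt;     /* only changed while the modelled lock is held */
    int unsafe_calls;  /* times set_master_model() ran under the lock */
};

static void bond_lock(struct bond_model *b)   { b->lock_held = 1; }
static void bond_unlock(struct bond_model *b) { b->lock_held = 0; }

/* Stands in for netdev_set_master(): records a violation if it is
 * invoked while the modelled bond lock is held. */
static int set_master_model(struct bond_model *b)
{
    if (b->lock_held)
        b->unsafe_calls++;
    return 0;
}

/* Enslave in the order the fix uses: set the master first, then take
 * the lock only for the list manipulation. */
static int enslave_model(struct bond_model *b)
{
    int err = set_master_model(b);

    if (err)
        return err;

    bond_lock(b);
    b->slave_cnt++;
    bond_unlock(b);
    return 0;
}
```

Keeping the locked region this small also means the early error returns no longer need to drop the lock.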
diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding.c 2003-03-18 17:03:25.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c 2003-03-18 17:03:26.000000000 +0200 @@ -295,6 +295,10 @@ * - Fixed hang in bond_release() while traffic is running. * netdev_set_master() must not be called from within the bond lock. * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Fixed hang in bond_enslave(): netdev_set_master() must not be + * called from within the bond lock while traffic is running. */ #include @@ -1066,14 +1070,12 @@ static int bond_enslave(struct net_devic "Warning : no link monitoring support for %s\n", slave_dev->name); } - write_lock_irqsave(&bond->lock, flags); /* not running. */ if ((slave_dev->flags & IFF_UP) != IFF_UP) { #ifdef BONDING_DEBUG printk(KERN_CRIT "Error, slave_dev is not running\n"); #endif - write_unlock_irqrestore(&bond->lock, flags); return -EINVAL; } @@ -1082,12 +1084,10 @@ static int bond_enslave(struct net_devic #ifdef BONDING_DEBUG printk(KERN_CRIT "Error, Device was already enslaved\n"); #endif - write_unlock_irqrestore(&bond->lock, flags); return -EBUSY; } if ((new_slave = kmalloc(sizeof(slave_t), GFP_ATOMIC)) == NULL) { - write_unlock_irqrestore(&bond->lock, flags); return -ENOMEM; } memset(new_slave, 0, sizeof(slave_t)); @@ -1100,9 +1100,7 @@ static int bond_enslave(struct net_devic #ifdef BONDING_DEBUG printk(KERN_CRIT "Error %d calling netdev_set_master\n", err); #endif - kfree(new_slave); - write_unlock_irqrestore(&bond->lock, flags); - return err; + goto err_free; } new_slave->dev = slave_dev; @@ -1121,6 +1119,8 @@ static int bond_enslave(struct net_devic dev_mc_add (slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); } + write_lock_irqsave(&bond->lock, flags); + bond_attach_slave(bond, new_slave); new_slave->delay = 0; new_slave->link_failure_count = 0; @@ -1259,7 +1259,11 @@ 
static int bond_enslave(struct net_devic new_slave->state == BOND_STATE_ACTIVE ? "n active" : " backup", new_slave->link == BOND_LINK_UP ? "n up" : " down"); + //enslave is successfull return 0; +err_free: + kfree(new_slave); + return err; } /* @@ -1607,6 +1611,9 @@ static int bond_release_all(struct net_d kfree(our_slave); + /* Can be safely called from inside the bond lock + since traffic and timers have already stopped + */ netdev_set_master(slave_dev, NULL); /* only restore its RUNNING flag if monitoring set it down */ -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:16:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:16:37 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFGXq9025338 for ; Thu, 20 Mar 2003 07:16:33 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by hermes.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFD2f09202 for ; Thu, 20 Mar 2003 15:13:03 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFHsc08500 for ; Thu, 20 Mar 2003 15:17:54 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007164119211 ; Thu, 20 Mar 2003 07:16:43 -0800 Date: Thu, 20 Mar 2003 17:16:25 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (4/8) Add 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: 
TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1992 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 4132 Lines: 136 This patch adds support for getting slave's speed and duplex via ethtool (Needed for 802.3ad and other future modes). This patch is against bonding 2.4.20-20030317. diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding.c 2003-03-18 17:03:26.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c 2003-03-18 17:03:27.000000000 +0200 @@ -299,6 +299,10 @@ * Shmulik Hen * - Fixed hang in bond_enslave(): netdev_set_master() must not be * called from within the bond lock while traffic is running. + * + * 2003/03/18 - Amir Noam + * - Added support for getting slave's speed and duplex via ethtool. + * Needed for 802.3ad and other future modes. */ #include @@ -649,6 +653,59 @@ bond_attach_slave(struct bonding *bond, set_fs(fs); \ ret; }) +/* + * Get link speed and duplex from the slave's base driver + * using ethtool. If for some reason the call fails or the + * values are invalid, fake speed and duplex to 100/Full + * and return error. 
+ */ +static int bond_update_speed_duplex(struct slave *slave) +{ + struct net_device *dev = slave->dev; + static int (* ioctl)(struct net_device *, struct ifreq *, int); + struct ifreq ifr; + struct ethtool_cmd etool; + + ioctl = dev->do_ioctl; + if (ioctl) { + etool.cmd = ETHTOOL_GSET; + ifr.ifr_data = (char*)&etool; + if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) { + slave->speed = etool.speed; + slave->duplex = etool.duplex; + } else { + goto err_out; + } + } else { + goto err_out; + } + + switch (slave->speed) { + case SPEED_10: + case SPEED_100: + case SPEED_1000: + break; + default: + goto err_out; + } + + switch (slave->duplex) { + case DUPLEX_FULL: + case DUPLEX_HALF: + break; + default: + goto err_out; + } + + return 0; + +err_out: + //Fake speed and duplex + slave->speed = SPEED_100; + slave->duplex = DUPLEX_FULL; + return -1; +} + /* * if supports MII link status reporting, check its link status. * @@ -1173,6 +1230,13 @@ static int bond_enslave(struct net_devic new_slave->link = BOND_LINK_DOWN; } + if (bond_update_speed_duplex(new_slave) && (new_slave->link == BOND_LINK_UP) ) { + printk(KERN_WARNING + "bond_enslave(): failed to get speed/duplex from %s, " + "speed forced to 100Mbps, duplex forced to Full.\n", + new_slave->dev->name); + } + /* if we're in active-backup mode, we need one and only one active * interface. 
The backup interfaces will have their NOARP flag set * because we need them to be completely deaf and not to respond to @@ -1821,6 +1885,9 @@ static void bond_mii_monitor(struct net_ } break; } /* end of switch */ + + bond_update_speed_duplex(slave); + } /* end of while */ /* diff -Nuarp linux-2.4.20-bonding-20030317/include/linux/if_bonding.h linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h --- linux-2.4.20-bonding-20030317/include/linux/if_bonding.h 2003-03-18 17:03:26.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h 2003-03-18 17:03:27.000000000 +0200 @@ -11,6 +11,9 @@ * This software may be used and distributed according to the terms * of the GNU Public License, incorporated herein by reference. * + * 2003/03/18 - Amir Noam + * - Added support for getting slave's speed and duplex via ethtool. + * Needed for 802.3ad and other future modes. */ #ifndef _LINUX_IF_BONDING_H @@ -89,6 +92,8 @@ typedef struct slave { char state; /* one of BOND_STATE_XXXX */ unsigned short original_flags; u32 link_failure_count; + u16 speed; + u8 duplex; } slave_t; /* -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. 
| | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:17:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:17:26 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFHIq9026084 for ; Thu, 20 Mar 2003 07:17:19 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFAc606657 for ; Thu, 20 Mar 2003 15:10:39 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFIac08996 for ; Thu, 20 Mar 2003 15:18:36 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007172214967 ; Thu, 20 Mar 2003 07:17:24 -0800 Date: Thu, 20 Mar 2003 17:17:07 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (5/8) Add 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1993 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 17875 Lines: 511 This patch enables support of modes that need to use the unique mac address of each slave. It moves setting the slave's mac address and opening it from the application to the driver. This breaks backward compatibility between the new driver and older applications ! 
It also blocks the possibility of enslaving before the master is up (to prevent putting the system in an unstable state), and removes the code that unconditionally restores all of the base driver's flags (flags are restored automatically once all undo stages are done in the proper order).

This patch is against bonding 2.4.20-20030317.

diff -Nuarp linux-2.4.20-bonding-20030317/Documentation/networking/ifenslave.c linux-2.4.20-bonding-20030317-devel/Documentation/networking/ifenslave.c
--- linux-2.4.20-bonding-20030317/Documentation/networking/ifenslave.c 2003-03-18 17:03:28.000000000 +0200
+++ linux-2.4.20-bonding-20030317-devel/Documentation/networking/ifenslave.c 2003-03-18 17:03:28.000000000 +0200
@@ -51,6 +51,15 @@
 * multiple interfaces are specified on a single ifenslave command
 * (ifenslave bond0 eth0 eth1).
 *
+ * - 2003/03/18 - Tsippy Mendelson and
+ * Shmulik Hen
+ * - Moved setting the slave's mac address and openning it, from
+ * the application to the driver. This enables support of modes
+ * that need to use the unique mac address of each slave.
+ * The driver also takes care of closing the slave and restoring its
+ * original mac address upon release.
+ * In addition, block possibility of enslaving before the master is up.
+ * This prevents putting the system in an undefined state.
 */

 static char *version =
@@ -278,30 +287,11 @@ main(int argc, char **argv)
 fprintf(stderr, "SIOCBONDRELEASE: cannot detach %s from %s.
errno=%s.\n", slave_ifname, master_ifname, strerror(errno)); } - else { /* we'll set the interface down to avoid any conflicts due to - same IP/MAC */ - strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ); - if (ioctl(skfd, SIOCGIFFLAGS, &ifr2) < 0) { - int saved_errno = errno; - fprintf(stderr, "SIOCGIFFLAGS on %s failed: %s\n", slave_ifname, - strerror(saved_errno)); - } - else { - ifr2.ifr_flags &= ~(IFF_UP | IFF_RUNNING); - if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) { - int saved_errno = errno; - fprintf(stderr, "Shutting down interface %s failed: %s\n", - slave_ifname, strerror(saved_errno)); - } - } - } + /* the bonding module takes care of restoring the slaves original + * mac address and closing its net device + */ } else { /* attach a slave interface to the master */ - /* two possibilities : - - if hwaddr_notset, do nothing. The bond will assign the - hwaddr from it's first slave. - - if !hwaddr_notset, assign the master's hwaddr to each slave - */ strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ); if (ioctl(skfd, SIOCGIFFLAGS, &ifr2) < 0) { @@ -311,6 +301,7 @@ main(int argc, char **argv) return 1; } + /* if hwaddr_notset, assign the slave hw address to the master */ if (hwaddr_notset) { /* assign the slave hw address to the * master since it currently does not @@ -341,6 +332,10 @@ main(int argc, char **argv) */ master_up = 1; } + } else { + fprintf(stderr, "Cannot enslave; the specified master interface '%s' is not up.\n", master_ifname); + + exit (1); } if (!goterr) { @@ -389,41 +384,10 @@ main(int argc, char **argv) } } - } else { - /* we'll assign master's hwaddr to this slave */ - if (ifr2.ifr_flags & IFF_UP) { - ifr2.ifr_flags &= ~IFF_UP; - if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) { - int saved_errno = errno; - fprintf(stderr, "Shutting down interface %s failed: %s\n", - slave_ifname, strerror(saved_errno)); - } - } - - strncpy(if_hwaddr.ifr_name, slave_ifname, IFNAMSIZ); - if (ioctl(skfd, SIOCSIFHWADDR, &if_hwaddr) < 0) { - int saved_errno = errno; - 
fprintf(stderr, "SIOCSIFHWADDR on %s failed: %s\n", if_hwaddr.ifr_name, - strerror(saved_errno)); - if (saved_errno == EBUSY) - fprintf(stderr, " The slave device %s is busy: it must be" - " idle before running this command.\n", slave_ifname); - else if (saved_errno == EOPNOTSUPP) - fprintf(stderr, " The slave device you specified does not support" - " setting the MAC address.\n Your kernel likely does not" - " support slave devices.\n"); - else if (saved_errno == EINVAL) - fprintf(stderr, " The slave device's address type does not match" - " the master's address type.\n"); - } else { - if (verbose) { - unsigned char *hwaddr = if_hwaddr.ifr_hwaddr.sa_data; - printf("Slave's (%s) hardware address set to " - "%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", slave_ifname, - hwaddr[0], hwaddr[1], hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]); - } - } } + /* the bonding module takes care of setting the slave's mac address + * according to the mode requirements. + */ if (*spp && !strcmp(*spp, "metric")) { if (*++spp == NULL) { @@ -500,18 +464,18 @@ main(int argc, char **argv) } } - ifr2.ifr_flags |= IFF_UP; /* the interface will need to be up to be bonded */ - if ((ifr2.ifr_flags &= ~(IFF_SLAVE | IFF_MASTER)) == 0 - || strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ) <= 0 - || ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) { - fprintf(stderr, - "Something broke setting the slave (%s) flags: %s.\n", - slave_ifname, strerror(errno)); - } else { - if (verbose) - printf("Set the slave's (%s) flags %4.4x.\n", slave_ifname, if_flags.ifr_flags); + /* the bonding module takes care of openning the interface + * after setting its mac address + */ + if (ifr2.ifr_flags & IFF_UP) { // the interface will need to be down + ifr2.ifr_flags &= ~IFF_UP; + if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) { + int saved_errno = errno; + fprintf(stderr, "Shutting down interface %s failed: %s\n", + slave_ifname, strerror(saved_errno)); + } } - + /* Do the real thing */ if ( ! 
opt_r) { strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ); diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding.c 2003-03-18 17:03:28.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c 2003-03-18 17:03:28.000000000 +0200 @@ -303,6 +303,22 @@ * 2003/03/18 - Amir Noam * - Added support for getting slave's speed and duplex via ethtool. * Needed for 802.3ad and other future modes. + * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Enable support of modes that need to use the unique mac address of + * each slave. + * * bond_enslave(): Moved setting the slave's mac address, and + * openning it, from the application to the driver. This breaks + * backward comaptibility with old versions of ifenslave that open + * the slave before enalsving it !!!. + * * bond_release(): The driver also takes care of closing the slave + * and restoring its original mac address. + * - Removed the code that restores all base driver's flags. + * Flags are automatically restored once all undo stages are done + * properly. + * - Block possibility of enslaving before the master is up. This + * prevents putting the system in an unstable state. 
*/ #include @@ -433,7 +449,6 @@ static void bond_mii_monitor(struct net_ static void loadbalance_arp_monitor(struct net_device *dev); static void activebackup_arp_monitor(struct net_device *dev); static int bond_event(struct notifier_block *this, unsigned long event, void *ptr); -static void bond_restore_slave_flags(slave_t *slave); static void bond_mc_list_destroy(struct bonding *bond); static void bond_mc_add(bonding_t *bond, void *addr, int alen); static void bond_mc_delete(bonding_t *bond, void *addr, int alen); @@ -509,11 +524,6 @@ multicast_mode_name(void) } } -static void bond_restore_slave_flags(slave_t *slave) -{ - slave->dev->flags = slave->original_flags; -} - static void bond_set_slave_inactive_flags(slave_t *slave) { slave->state = BOND_STATE_BACKUP; @@ -1110,12 +1120,12 @@ static int bond_enslave(struct net_devic slave_t *new_slave = NULL; unsigned long flags = 0; unsigned long rflags = 0; - int ndx = 0; int err = 0; struct dev_mc_list *dmi; struct in_ifaddr **ifap; struct in_ifaddr *ifa; int link_reporting; + struct sockaddr addr; if (master_dev == NULL || slave_dev == NULL) { return -ENODEV; @@ -1128,12 +1138,14 @@ static int bond_enslave(struct net_devic slave_dev->name); } - /* not running. */ - if ((slave_dev->flags & IFF_UP) != IFF_UP) { + /* This breaks backward comaptibility with old versions + of ifenslave which open the slave before enalsving */ + /* already up. 
*/ + if ((slave_dev->flags & IFF_UP) == IFF_UP) { #ifdef BONDING_DEBUG - printk(KERN_CRIT "Error, slave_dev is not running\n"); + printk(KERN_CRIT "Error, slave_dev is up\n"); #endif - return -EINVAL; + return -EBUSY; } /* already enslaved */ @@ -1144,20 +1156,66 @@ static int bond_enslave(struct net_devic return -EBUSY; } + /* bond must be initialize by bond_open() before enslaving */ + if ((master_dev->flags & IFF_UP) != IFF_UP) { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "Error, master_dev is not up\n"); +#endif + return -EPERM; + } + + if (slave_dev->set_mac_address == NULL) { + printk(KERN_CRIT " The slave device you specified does not support" + " setting the MAC address.\n Your kernel likely does not" + " support slave devices.\n"); + return -EOPNOTSUPP; + } + if ((new_slave = kmalloc(sizeof(slave_t), GFP_ATOMIC)) == NULL) { return -ENOMEM; } memset(new_slave, 0, sizeof(slave_t)); - /* save flags before call to netdev_set_master */ + /* save slave's original flags before calling */ + /* netdev_set_master and dev_open */ new_slave->original_flags = slave_dev->flags; + + /* save slave's original ("permanent") mac address for + modes that needs it, and for restoring it upon release, + and then set it to the master's address */ + memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN); + + if (bond->next != (slave_t*)bond) { + /* set slave to master's mac address + The application already set the master's + mac address to that of the first slave */ + memcpy(addr.sa_data, master_dev->dev_addr, ETH_ALEN); + addr.sa_family = slave_dev->type; + err = slave_dev->set_mac_address(slave_dev, &addr); + if (err) { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "Error %d calling set_mac_address\n", err); +#endif + goto err_free; + } + } + + /* open the slave since the application closed it */ + err = dev_open(slave_dev); + if (err) { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "Openning slave %s failed\n", slave_dev->name); +#endif + goto err_restore_mac; + } + err = 
netdev_set_master(slave_dev, master_dev); if (err) { #ifdef BONDING_DEBUG printk(KERN_CRIT "Error %d calling netdev_set_master\n", err); #endif - goto err_free; + goto err_close; } new_slave->dev = slave_dev; @@ -1285,39 +1343,6 @@ static int bond_enslave(struct net_devic write_unlock_irqrestore(&bond->lock, flags); - /* - * !!! This is to support old versions of ifenslave. We can remove - * this in 2.5 because our ifenslave takes care of this for us. - * We check to see if the master has a mac address yet. If not, - * we'll give it the mac address of our slave device. - */ - for (ndx = 0; ndx < slave_dev->addr_len; ndx++) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Checking ndx=%d of master_dev->dev_addr\n", - ndx); -#endif - if (master_dev->dev_addr[ndx] != 0) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Found non-zero byte at ndx=%d\n", - ndx); -#endif - break; - } - } - if (ndx == slave_dev->addr_len) { - /* - * We got all the way through the address and it was - * all 0's. - */ -#ifdef BONDING_DEBUG - printk(KERN_CRIT "%s doesn't have a MAC address yet. ", - master_dev->name); - printk(KERN_CRIT "Going to give assign it from %s.\n", - slave_dev->name); -#endif - bond_sethwaddr(master_dev, slave_dev); - } - printk (KERN_INFO "%s: enslaving %s as a%s interface with a%s link.\n", master_dev->name, slave_dev->name, new_slave->state == BOND_STATE_ACTIVE ? 
"n active" : " backup", @@ -1325,6 +1350,16 @@ static int bond_enslave(struct net_devic //enslave is successfull return 0; + +// Undo stages on error +err_close: + dev_close(slave_dev); + +err_restore_mac: + memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); + err_free: kfree(new_slave); return err; @@ -1539,6 +1574,7 @@ static int bond_release(struct net_devic bonding_t *bond; slave_t *our_slave, *old_current; unsigned long flags; + struct sockaddr addr; if (master == NULL || slave == NULL) { return -ENODEV; @@ -1612,21 +1648,29 @@ static int bond_release(struct net_devic netdev_set_master(slave, NULL); - /* only restore its RUNNING flag if monitoring set it down */ - if (slave->flags & IFF_UP) { - slave->flags |= IFF_RUNNING; - } + /* close slave before restoring its mac address */ + dev_close(slave); - if (slave->flags & IFF_NOARP || - bond->current_slave != NULL) { - dev_close(slave); - our_slave->original_flags &= ~IFF_UP; + /* restore original ("permanent") mac address*/ + memcpy(addr.sa_data, our_slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave->type; + slave->set_mac_address(slave, &addr); + + /* restore the original state of the IFF_NOARP flag that might have */ + /* been set by bond_set_slave_inactive_flags() */ + if ((our_slave->original_flags & IFF_NOARP) == 0) { + slave->flags &= ~IFF_NOARP; } - bond_restore_slave_flags(our_slave); - kfree(our_slave); + /* if the last slave was removed, zero the mac address + of the master so it will be set by the application + to the mac address of the first slave */ + if (bond->next == (slave_t*)bond) { + memset(master->dev_addr, 0, master->addr_len); + } + return 0; /* deletion OK */ } @@ -1639,6 +1683,7 @@ static int bond_release_all(struct net_d bonding_t *bond; slave_t *our_slave; struct net_device *slave_dev; + struct sockaddr addr; if (master == NULL) { return -ENODEV; @@ -1673,21 +1718,33 @@ static int 
bond_release_all(struct net_d dev_set_allmulti(slave_dev, -1); } - kfree(our_slave); - /* Can be safely called from inside the bond lock since traffic and timers have already stopped */ netdev_set_master(slave_dev, NULL); - /* only restore its RUNNING flag if monitoring set it down */ - if (slave_dev->flags & IFF_UP) - slave_dev->flags |= IFF_RUNNING; + /* close slave before restoring its mac address */ + dev_close(slave_dev); + + /* restore original ("permanent") mac address*/ + memcpy(addr.sa_data, our_slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); + + /* restore the original state of the IFF_NOARP flag that might have */ + /* been set by bond_set_slave_inactive_flags() */ + if ((our_slave->original_flags & IFF_NOARP) == 0) { + slave_dev->flags &= ~IFF_NOARP; + } - if (slave_dev->flags & IFF_NOARP) - dev_close(slave_dev); + kfree(our_slave); } + /* zero the mac address of the master so it will be + set by the application to the mac address of the + first slave */ + memset(master->dev_addr, 0, master->addr_len); + printk (KERN_INFO "%s: released all slaves\n", master->name); return 0; @@ -2904,6 +2961,15 @@ static int bond_get_info(char *buf, char "up\n" : "down\n"); len += sprintf(buf + len, "Link Failure Count: %d\n", slave->link_failure_count); + + len += sprintf(buf + len, + "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", + slave->perm_hwaddr[0], + slave->perm_hwaddr[1], + slave->perm_hwaddr[2], + slave->perm_hwaddr[3], + slave->perm_hwaddr[4], + slave->perm_hwaddr[5]); } read_unlock_irqrestore(&bond->lock, flags); diff -Nuarp linux-2.4.20-bonding-20030317/include/linux/if_bonding.h linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h --- linux-2.4.20-bonding-20030317/include/linux/if_bonding.h 2003-03-18 17:03:28.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h 2003-03-18 17:03:28.000000000 +0200 @@ -14,6 +14,11 @@ * 2003/03/18 - Amir Noam * - 
Added support for getting slave's speed and duplex via ethtool. * Needed for 802.3ad and other future modes. + * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Enable support of modes that need to use the unique mac address of + * each slave. */ #ifndef _LINUX_IF_BONDING_H @@ -94,6 +99,7 @@ typedef struct slave { u32 link_failure_count; u16 speed; u8 duplex; + u8 perm_hwaddr[ETH_ALEN]; } slave_t; /* -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:18:00 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:18:07 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFI0q9026416 for ; Thu, 20 Mar 2003 07:18:00 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by hermes.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFETf10650 for ; Thu, 20 Mar 2003 15:14:30 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFJFc09412 for ; Thu, 20 Mar 2003 15:19:15 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007180220753 ; Thu, 20 Mar 2003 07:18:04 -0800 Date: Thu, 20 Mar 2003 17:17:46 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (7/8) Add 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1994 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 6313 Lines: 206 This patch moves the driver's private data types from include/linux/if_bonding.h to the local bonding.h. This patch is against bonding 2.4.20-20030317. diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bonding.h linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bonding.h --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bonding.h 1970-01-01 02:00:00.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bonding.h 2003-03-18 17:03:31.000000000 +0200 @@ -0,0 +1,68 @@ +/* + * Bond several ethernet interfaces into a Cisco, running 'Etherchannel'. + * + * Portions are (c) Copyright 1995 Simon "Guru Aleph-Null" Janes + * NCM: Network and Communications Management, Inc. + * + * BUT, I'm the one who modified it for ethernet, so: + * (c) Copyright 1999, Thomas Davis, tadavis@lbl.gov + * + * This software may be used and distributed according to the terms + * of the GNU Public License, incorporated herein by reference. + * + */ + +#ifndef _LINUX_BONDING_H +#define _LINUX_BONDING_H + +#include +#include + +typedef struct slave { + struct slave *next; + struct slave *prev; + struct net_device *dev; + short delay; + unsigned long jiffies; + char link; /* one of BOND_LINK_XXXX */ + char state; /* one of BOND_STATE_XXXX */ + unsigned short original_flags; + u32 link_failure_count; + u16 speed; + u8 duplex; + u8 perm_hwaddr[ETH_ALEN]; +} slave_t; + +/* + * Here are the locking policies for the two bonding locks: + * + * 1) Get bond->lock when reading/writing slave list. + * 2) Get bond->ptrlock when reading/writing bond->current_slave. + * (It is unnecessary when the write-lock is put with bond->lock.) + * 3) When we lock with bond->ptrlock, we must lock with bond->lock + * beforehand. 
+ */ +typedef struct bonding { + slave_t *next; + slave_t *prev; + slave_t *current_slave; + slave_t *primary_slave; + slave_t *current_arp_slave; + __s32 slave_cnt; + rwlock_t lock; + rwlock_t ptrlock; + struct timer_list mii_timer; + struct timer_list arp_timer; + struct net_device_stats *stats; +#ifdef CONFIG_PROC_FS + struct proc_dir_entry *bond_proc_dir; + struct proc_dir_entry *bond_proc_info_file; +#endif /* CONFIG_PROC_FS */ + struct bonding *next_bond; + struct net_device *device; + struct dev_mc_list *mc_list; + unsigned short flags; +} bonding_t; + +#endif /* _LINUX_BONDING_H */ + diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c 2003-03-18 17:03:30.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c 2003-03-18 17:03:31.000000000 +0200 @@ -358,6 +358,7 @@ #include #include #include +#include "bonding.h" #define DRV_VERSION "2.4.20-20030317" #define DRV_RELDATE "March 17, 2003" @@ -380,6 +381,11 @@ DRV_NAME ".c:v" DRV_VERSION " (" DRV_REL #define MAX_ARP_IP_TARGETS 16 #endif +struct bond_parm_tbl { + char *modename; + int mode; +}; + static int arp_interval = BOND_LINK_ARP_INTERV; static char *arp_ip_target[MAX_ARP_IP_TARGETS] = { NULL, }; static unsigned long arp_target[MAX_ARP_IP_TARGETS] = { 0, } ; diff -Nuarp linux-2.4.20-bonding-20030317/include/linux/if_bonding.h linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h --- linux-2.4.20-bonding-20030317/include/linux/if_bonding.h 2003-03-18 17:03:30.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h 2003-03-18 17:03:31.000000000 +0200 @@ -19,18 +19,18 @@ * Shmulik Hen * - Enable support of modes that need to use the unique mac address of * each slave. 
+ * + * 2003/03/18 - Tsippy Mendelson and + * Amir Noam + * - Moved driver's private data types to bonding.h */ #ifndef _LINUX_IF_BONDING_H #define _LINUX_IF_BONDING_H -#ifdef __KERNEL__ -#include #include -#include -#endif /* __KERNEL__ */ - #include +#include /* * We can remove these ioctl definitions in 2.5. People should use the @@ -66,11 +66,6 @@ #define BOND_MULTICAST_ACTIVE 1 #define BOND_MULTICAST_ALL 2 -struct bond_parm_tbl { - char *modename; - int mode; -}; - typedef struct ifbond { __s32 bond_mode; __s32 num_slaves; @@ -86,55 +81,7 @@ typedef struct ifslave __u32 link_failure_count; } ifslave; -#ifdef __KERNEL__ -typedef struct slave { - struct slave *next; - struct slave *prev; - struct net_device *dev; - short delay; - unsigned long jiffies; - char link; /* one of BOND_LINK_XXXX */ - char state; /* one of BOND_STATE_XXXX */ - unsigned short original_flags; - u32 link_failure_count; - u16 speed; - u8 duplex; - u8 perm_hwaddr[ETH_ALEN]; -} slave_t; - -/* - * Here are the locking policies for the two bonding locks: - * - * 1) Get bond->lock when reading/writing slave list. - * 2) Get bond->ptrlock when reading/writing bond->current_slave. - * (It is unnecessary when the write-lock is put with bond->lock.) - * 3) When we lock with bond->ptrlock, we must lock with bond->lock - * beforehand. 
- */ -typedef struct bonding { - slave_t *next; - slave_t *prev; - slave_t *current_slave; - slave_t *primary_slave; - slave_t *current_arp_slave; - __s32 slave_cnt; - rwlock_t lock; - rwlock_t ptrlock; - struct timer_list mii_timer; - struct timer_list arp_timer; - struct net_device_stats *stats; -#ifdef CONFIG_PROC_FS - struct proc_dir_entry *bond_proc_dir; - struct proc_dir_entry *bond_proc_info_file; -#endif /* CONFIG_PROC_FS */ - struct bonding *next_bond; - struct net_device *device; - struct dev_mc_list *mc_list; - unsigned short flags; -} bonding_t; -#endif /* __KERNEL__ */ - -#endif /* _LINUX_BOND_H */ +#endif /* _LINUX_IF_BONDING_H */ /* * Local variables: -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:18:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:18:42 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFHqq9026366 for ; Thu, 20 Mar 2003 07:17:53 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFB4607130 for ; Thu, 20 Mar 2003 15:11:08 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFIvc09227 for ; Thu, 20 Mar 2003 15:18:57 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007174120925 ; Thu, 20 Mar 2003 07:17:43 -0800 Date: Thu, 20 Mar 2003 17:17:25 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel 
Mailing list , Oss SGI Netdev list , Jeff Garzik
Subject: [patch] (6/8) Add 802.3ad support to bonding
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 1995
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hshmulik@intel.com
Precedence: bulk
X-list: netdev
Content-Length: 215563
Lines: 6940

This patch enables support for multiple files in the project. It moves bonding.c to a subdirectory of its own (drivers/net/bonding/) and renames it to bond_main.c.

This patch is against bonding 2.4.20-20030317.

diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c
--- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c 1970-01-01 02:00:00.000000000 +0200
+++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c 2003-03-18 17:03:29.000000000 +0200
@@ -0,0 +1,3434 @@
+/*
+ * originally based on the dummy device.
+ *
+ * Copyright 1999, Thomas Davis, tadavis@lbl.gov.
+ * Licensed under the GPL. Based on dummy.c, and eql.c devices.
+ *
+ * bonding.c: an Ethernet Bonding driver
+ *
+ * This is useful to talk to a Cisco EtherChannel compatible equipment:
+ * Cisco 5500
+ * Sun Trunking (Solaris)
+ * Alteon AceDirector Trunks
+ * Linux Bonding
+ * and probably many L2 switches ...
+ *
+ * How it works:
+ * ifconfig bond0 ipaddress netmask up
+ * will setup a network device, with an ip address. No mac address
+ * will be assigned at this time. The hw mac address will come from
+ * the first slave bonded to the channel. All slaves will then use
+ * this hw mac address.
+ *
+ * ifconfig bond0 down
+ * will release all slaves, marking them as down.
+ *
+ * ifenslave bond0 eth0
+ * will attach eth0 to bond0 as a slave.
eth0 hw mac address will either + * a: be used as initial mac address + * b: if a hw mac address already is there, eth0's hw mac address + * will then be set from bond0. + * + * v0.1 - first working version. + * v0.2 - changed stats to be calculated by summing slaves stats. + * + * Changes: + * Arnaldo Carvalho de Melo + * - fix leaks on failure at bond_init + * + * 2000/09/30 - Willy Tarreau + * - added trivial code to release a slave device. + * - fixed security bug (CAP_NET_ADMIN not checked) + * - implemented MII link monitoring to disable dead links : + * All MII capable slaves are checked every milliseconds + * (100 ms seems good). This value can be changed by passing it to + * insmod. A value of zero disables the monitoring (default). + * - fixed an infinite loop in bond_xmit_roundrobin() when there's no + * good slave. + * - made the code hopefully SMP safe + * + * 2000/10/03 - Willy Tarreau + * - optimized slave lists based on relevant suggestions from Thomas Davis + * - implemented active-backup method to obtain HA with two switches: + * stay as long as possible on the same active interface, while we + * also monitor the backup one (MII link status) because we want to know + * if we are able to switch at any time. ( pass "mode=1" to insmod ) + * - lots of stress testings because we need it to be more robust than the + * wires ! :-> + * + * 2000/10/09 - Willy Tarreau + * - added up and down delays after link state change. + * - optimized the slaves chaining so that when we run forward, we never + * repass through the bond itself, but we can find it by searching + * backwards. Renders the deletion more difficult, but accelerates the + * scan. + * - smarter enslaving and releasing. 
+ * - finer and more robust SMP locking + * + * 2000/10/17 - Willy Tarreau + * - fixed two potential SMP race conditions + * + * 2000/10/18 - Willy Tarreau + * - small fixes to the monitoring FSM in case of zero delays + * 2000/11/01 - Willy Tarreau + * - fixed first slave not automatically used in trunk mode. + * 2000/11/10 : spelling of "EtherChannel" corrected. + * 2000/11/13 : fixed a race condition in case of concurrent accesses to ioctl(). + * 2000/12/16 : fixed improper usage of rtnl_exlock_nowait(). + * + * 2001/1/3 - Chad N. Tindel + * - The bonding driver now simulates MII status monitoring, just like + * a normal network device. It will show that the link is down iff + * every slave in the bond shows that their links are down. If at least + * one slave is up, the bond's MII status will appear as up. + * + * 2001/2/7 - Chad N. Tindel + * - Applications can now query the bond from user space to get + * information which may be useful. They do this by calling + * the BOND_INFO_QUERY ioctl. Once the app knows how many slaves + * are in the bond, it can call the BOND_SLAVE_INFO_QUERY ioctl to + * get slave specific information (# link failures, etc). See + * for more details. The structs of interest + * are ifbond and ifslave. + * + * 2001/4/5 - Chad N. Tindel + * - Ported to 2.4 Kernel + * + * 2001/5/2 - Jeffrey E. Mast + * - When a device is detached from a bond, the slave device is no longer + * left thinking that is has a master. + * + * 2001/5/16 - Jeffrey E. Mast + * - memset did not appropriately initialized the bond rw_locks. Used + * rwlock_init to initialize to unlocked state to prevent deadlock when + * first attempting a lock + * - Called SET_MODULE_OWNER for bond device + * + * 2001/5/17 - Tim Anderson + * - 2 paths for releasing for slave release; 1 through ioctl + * and 2) through close. Both paths need to release the same way. + * - the free slave in bond release is changing slave status before + * the free. 
The netdev_set_master() is intended to change slave state + * so it should not be done as part of the release process. + * - Simple rule for slave state at release: only the active in A/B and + * only one in the trunked case. + * + * 2001/6/01 - Tim Anderson + * - Now call dev_close when releasing a slave so it doesn't screw up + * out routing table. + * + * 2001/6/01 - Chad N. Tindel + * - Added /proc support for getting bond and slave information. + * Information is in /proc/net//info. + * - Changed the locking when calling bond_close to prevent deadlock. + * + * 2001/8/05 - Janice Girouard + * - correct problem where refcnt of slave is not incremented in bond_ioctl + * so the system hangs when halting. + * - correct locking problem when unable to malloc in bond_enslave. + * - adding bond_xmit_xor logic. + * - adding multiple bond device support. + * + * 2001/8/13 - Erik Habbinga + * - correct locking problem with rtnl_exlock_nowait + * + * 2001/8/23 - Janice Girouard + * - bzero initial dev_bonds, to correct oops + * - convert SIOCDEVPRIVATE to new MII ioctl calls + * + * 2001/9/13 - Takao Indoh + * - Add the BOND_CHANGE_ACTIVE ioctl implementation + * + * 2001/9/14 - Mark Huth + * - Change MII_LINK_READY to not check for end of auto-negotiation, + * but only for an up link. + * + * 2001/9/20 - Chad N. Tindel + * - Add the device field to bonding_t. Previously the net_device + * corresponding to a bond wasn't available from the bonding_t + * structure. + * + * 2001/9/25 - Janice Girouard + * - add arp_monitor for active backup mode + * + * 2001/10/23 - Takao Indoh + * - Various memory leak fixes + * + * 2001/11/5 - Mark Huth + * - Don't take rtnl lock in bond_mii_monitor as it deadlocks under + * certain hotswap conditions. + * Note: this same change may be required in bond_arp_monitor ??? 
+ * - Remove possibility of calling bond_sethwaddr with NULL slave_dev ptr + * - Handle hot swap ethernet interface deregistration events to remove + * kernel oops following hot swap of enslaved interface + * + * 2002/1/2 - Chad N. Tindel + * - Restore original slave flags at release time. + * + * 2002/02/18 - Erik Habbinga + * - bond_release(): calling kfree on our_slave after call to + * bond_restore_slave_flags, not before + * - bond_enslave(): saving slave flags into original_flags before + * call to netdev_set_master, so the IFF_SLAVE flag doesn't end + * up in original_flags + * + * 2002/04/05 - Mark Smith and + * Steve Mead + * - Port Gleb Natapov's multicast support patchs from 2.4.12 + * to 2.4.18 adding support for multicast. + * + * 2002/06/10 - Tony Cureington + * - corrected uninitialized pointer (ifr.ifr_data) in bond_check_dev_link; + * actually changed function to use MIIPHY, then MIIREG, and finally + * ETHTOOL to determine the link status + * - fixed bad ifr_data pointer assignments in bond_ioctl + * - corrected mode 1 being reported as active-backup in bond_get_info; + * also added text to distinguish type of load balancing (rr or xor) + * - change arp_ip_target module param from "1-12s" (array of 12 ptrs) + * to "s" (a single ptr) + * + * 2002/08/30 - Jay Vosburgh + * - Removed acquisition of xmit_lock in set_multicast_list; caused + * deadlock on SMP (lock is held by caller). + * - Revamped SIOCGMIIPHY, SIOCGMIIREG portion of bond_check_dev_link(). + * + * 2002/09/18 - Jay Vosburgh + * - Fixed up bond_check_dev_link() (and callers): removed some magic + * numbers, banished local MII_ defines, wrapped ioctl calls to + * prevent EFAULT errors + * + * 2002/9/30 - Jay Vosburgh + * - make sure the ip target matches the arp_target before saving the + * hw address. + * + * 2002/9/30 - Dan Eisner + * - make sure my_ip is set before taking down the link, since + * not all switches respond if the source ip is not set. 
+ * + * 2002/10/8 - Janice Girouard + * - read in the local ip address when enslaving a device + * - add primary support + * - make sure 2*arp_interval has passed when a new device + * is brought on-line before taking it down. + * + * 2002/09/11 - Philippe De Muyter + * - Added bond_xmit_broadcast logic. + * - Added bond_mode() support function. + * + * 2002/10/26 - Laurent Deniel + * - allow to register multicast addresses only on active slave + * (useful in active-backup mode) + * - add multicast module parameter + * - fix deletion of multicast groups after unloading module + * + * 2002/11/06 - Kameshwara Rayaprolu + * - Changes to prevent panic from closing the device twice; if we close + * the device in bond_release, we must set the original_flags to down + * so it won't be closed again by the network layer. + * + * 2002/11/07 - Tony Cureington + * - Fix arp_target_hw_addr memory leak + * - Created activebackup_arp_monitor function to handle arp monitoring + * in active backup mode - the bond_arp_monitor had several problems... 
+ * such as allowing slaves to tx arps sequentially without any delay + * for a response + * - Renamed bond_arp_monitor to loadbalance_arp_monitor and re-wrote + * this function to just handle arp monitoring in load-balancing mode; + * it is a lot more compact now + * - Changes to ensure one and only one slave transmits in active-backup + * mode + * - Robustesize parameters; warn users about bad combinations of + * parameters; also if miimon is specified and a network driver does + * not support MII or ETHTOOL, inform the user of this + * - Changes to support link_failure_count when in arp monitoring mode + * - Fix up/down delay reported in /proc + * - Added version; log version; make version available from "modinfo -d" + * - Fixed problem in bond_check_dev_link - if the first IOCTL (SIOCGMIIPH) + * failed, the ETHTOOL ioctl never got a chance + * + * 2002/11/16 - Laurent Deniel + * - fix multicast handling in activebackup_arp_monitor + * - remove one unnecessary and confusing current_slave == slave test + * in activebackup_arp_monitor + * + * 2002/11/17 - Laurent Deniel + * - fix bond_slave_info_query when slave_id = num_slaves + * + * 2002/11/19 - Janice Girouard + * - correct ifr_data reference. Update ifr_data reference + * to mii_ioctl_data struct values to avoid confusion. + * + * 2002/11/22 - Bert Barbe + * - Add support for multiple arp_ip_target + * + * 2002/12/13 - Jay Vosburgh + * - Changed to allow text strings for mode and multicast, e.g., + * insmod bonding mode=active-backup. The numbers still work. + * One change: an invalid choice will cause module load failure, + * rather than the previous behavior of just picking one. + * - Minor cleanups; got rid of dup ctype stuff, atoi function + * + * 2003/02/07 - Jay Vosburgh + * - Added use_carrier module parameter that causes miimon to + * use netif_carrier_ok() test instead of MII/ETHTOOL ioctls. + * - Minor cleanups; consolidated ioctl calls to one function. 
+ * + * 2003/02/07 - Tony Cureington + * - Fix bond_mii_monitor() logic error that could result in + * bonding round-robin mode ignoring links after failover/recovery + * + * 2003/03/17 - Jay Vosburgh + * - kmalloc fix (GFP_KERNEL to GFP_ATOMIC) reported by + * Shmulik dot Hen at intel.com. + * - Based on discussion on mailing list, changed use of + * update_slave_cnt(), created wrapper functions for adding/removing + * slaves, changed bond_xmit_xor() to check slave_cnt instead of + * checking slave and slave->dev (which only worked by accident). + * - Misc code cleanup: get arp_send() prototype from header file, + * add max_bonds to bonding.txt. + * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Make sure only bond_attach_slave() and bond_detach_slave() can + * manipulate the slave list, including slave_cnt, even when in + * bond_release_all(). + * - Fixed hang in bond_release() while traffic is running. + * netdev_set_master() must not be called from within the bond lock. + * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Fixed hang in bond_enslave(): netdev_set_master() must not be + * called from within the bond lock while traffic is running. + * + * 2003/03/18 - Amir Noam + * - Added support for getting slave's speed and duplex via ethtool. + * Needed for 802.3ad and other future modes. + * + * 2003/03/18 - Tsippy Mendelson and + * Shmulik Hen + * - Enable support of modes that need to use the unique mac address of + * each slave. + * * bond_enslave(): Moved setting the slave's mac address, and + * openning it, from the application to the driver. This breaks + * backward comaptibility with old versions of ifenslave that open + * the slave before enalsving it !!!. + * * bond_release(): The driver also takes care of closing the slave + * and restoring its original mac address. + * - Removed the code that restores all base driver's flags. + * Flags are automatically restored once all undo stages are done + * properly. 
+ * - Block possibility of enslaving before the master is up. This + * prevents putting the system in an unstable state. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#define DRV_VERSION "2.4.20-20030317" +#define DRV_RELDATE "March 17, 2003" +#define DRV_NAME "bonding" +#define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" + +static const char *version = +DRV_NAME ".c:v" DRV_VERSION " (" DRV_RELDATE ")\n"; + +/* monitor all links that often (in milliseconds). <=0 disables monitoring */ +#ifndef BOND_LINK_MON_INTERV +#define BOND_LINK_MON_INTERV 0 +#endif + +#ifndef BOND_LINK_ARP_INTERV +#define BOND_LINK_ARP_INTERV 0 +#endif + +#ifndef MAX_ARP_IP_TARGETS +#define MAX_ARP_IP_TARGETS 16 +#endif + +static int arp_interval = BOND_LINK_ARP_INTERV; +static char *arp_ip_target[MAX_ARP_IP_TARGETS] = { NULL, }; +static unsigned long arp_target[MAX_ARP_IP_TARGETS] = { 0, } ; +static int arp_ip_count = 0; +static u32 my_ip = 0; +char *arp_target_hw_addr = NULL; + +static char *primary= NULL; + +static int max_bonds = BOND_DEFAULT_MAX_BONDS; +static int miimon = BOND_LINK_MON_INTERV; +static int use_carrier = 1; +static int bond_mode = BOND_MODE_ROUNDROBIN; +static int updelay = 0; +static int downdelay = 0; + +static char *mode = NULL; + +static struct bond_parm_tbl bond_mode_tbl[] = { +{ "balance-rr", BOND_MODE_ROUNDROBIN}, +{ "active-backup", BOND_MODE_ACTIVEBACKUP}, +{ "balance-xor", BOND_MODE_XOR}, +{ "broadcast", BOND_MODE_BROADCAST}, +{ NULL, -1}, +}; + +static int multicast_mode = BOND_MULTICAST_ALL; +static char *multicast = NULL; + +static struct bond_parm_tbl bond_mc_tbl[] = { +{ "disabled", BOND_MULTICAST_DISABLED}, +{ "active", 
BOND_MULTICAST_ACTIVE}, +{ "all", BOND_MULTICAST_ALL}, +{ NULL, -1}, +}; + +static int first_pass = 1; +static struct bonding *these_bonds = NULL; +static struct net_device *dev_bonds = NULL; + +MODULE_PARM(max_bonds, "i"); +MODULE_PARM_DESC(max_bonds, "Max number of bonded devices"); +MODULE_PARM(miimon, "i"); +MODULE_PARM_DESC(miimon, "Link check interval in milliseconds"); +MODULE_PARM(use_carrier, "i"); +MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 09 for off, 1 for on (default)"); +MODULE_PARM(mode, "s"); +MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor"); +MODULE_PARM(arp_interval, "i"); +MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds"); +MODULE_PARM(arp_ip_target, "1-" __MODULE_STRING(MAX_ARP_IP_TARGETS) "s"); +MODULE_PARM_DESC(arp_ip_target, "arp targets in n.n.n.n form"); +MODULE_PARM(updelay, "i"); +MODULE_PARM_DESC(updelay, "Delay before considering link up, in milliseconds"); +MODULE_PARM(downdelay, "i"); +MODULE_PARM_DESC(downdelay, "Delay before considering link down, in milliseconds"); +MODULE_PARM(primary, "s"); +MODULE_PARM_DESC(primary, "Primary network device to use"); +MODULE_PARM(multicast, "s"); +MODULE_PARM_DESC(multicast, "Mode for multicast support : 0 for none, 1 for active slave, 2 for all slaves (default)"); + +static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *dev); +static int bond_xmit_xor(struct sk_buff *skb, struct net_device *dev); +static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *dev); +static struct net_device_stats *bond_get_stats(struct net_device *dev); +static void bond_mii_monitor(struct net_device *dev); +static void loadbalance_arp_monitor(struct net_device *dev); +static void activebackup_arp_monitor(struct net_device *dev); +static int bond_event(struct notifier_block *this, unsigned long event, void *ptr); +static void bond_mc_list_destroy(struct bonding *bond); +static void 
bond_mc_add(bonding_t *bond, void *addr, int alen); +static void bond_mc_delete(bonding_t *bond, void *addr, int alen); +static int bond_mc_list_copy (struct dev_mc_list *src, struct bonding *dst, int gpf_flag); +static inline int dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2); +static void bond_set_promiscuity(bonding_t *bond, int inc); +static void bond_set_allmulti(bonding_t *bond, int inc); +static struct dev_mc_list* bond_mc_list_find_dmi(struct dev_mc_list *dmi, struct dev_mc_list *mc_list); +static void bond_mc_update(bonding_t *bond, slave_t *new, slave_t *old); +static void bond_set_slave_inactive_flags(slave_t *slave); +static void bond_set_slave_active_flags(slave_t *slave); +static int bond_enslave(struct net_device *master, struct net_device *slave); +static int bond_release(struct net_device *master, struct net_device *slave); +static int bond_release_all(struct net_device *master); +static int bond_sethwaddr(struct net_device *master, struct net_device *slave); + +/* + * bond_get_info is the interface into the /proc filesystem. This is + * a different interface than the BOND_INFO_QUERY ioctl. That is done + * through the generic networking ioctl interface, and bond_info_query + * is the internal function which provides that information. 
+ */
+static int bond_get_info(char *buf, char **start, off_t offset, int length);
+
+/* #define BONDING_DEBUG 1 */
+
+/* several macros */
+
+#define IS_UP(dev)  ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \
+		     (netif_running(dev) && netif_carrier_ok(dev)))
+
+static void arp_send_all(slave_t *slave)
+{
+	int i;
+
+	for (i = 0; (i < MAX_ARP_IP_TARGETS) && arp_target[i]; i++) {
+		arp_send(ARPOP_REQUEST, ETH_P_ARP, arp_target[i], slave->dev,
+			 my_ip, arp_target_hw_addr, slave->dev->dev_addr,
+			 arp_target_hw_addr);
+	}
+}
+
+
+static const char *
+bond_mode_name(void)
+{
+	switch (bond_mode) {
+	case BOND_MODE_ROUNDROBIN :
+		return "load balancing (round-robin)";
+	case BOND_MODE_ACTIVEBACKUP :
+		return "fault-tolerance (active-backup)";
+	case BOND_MODE_XOR :
+		return "load balancing (xor)";
+	case BOND_MODE_BROADCAST :
+		return "fault-tolerance (broadcast)";
+	default :
+		return "unknown";
+	}
+}
+
+static const char *
+multicast_mode_name(void)
+{
+	switch(multicast_mode) {
+	case BOND_MULTICAST_DISABLED :
+		return "disabled";
+	case BOND_MULTICAST_ACTIVE :
+		return "active slave only";
+	case BOND_MULTICAST_ALL :
+		return "all slaves";
+	default :
+		return "unknown";
+	}
+}
+
+static void bond_set_slave_inactive_flags(slave_t *slave)
+{
+	slave->state = BOND_STATE_BACKUP;
+	slave->dev->flags |= IFF_NOARP;
+}
+
+static void bond_set_slave_active_flags(slave_t *slave)
+{
+	slave->state = BOND_STATE_ACTIVE;
+	slave->dev->flags &= ~IFF_NOARP;
+}
+
+/*
+ * This function counts and verifies the number of attached
+ * slaves, checking the count against the expected value (given that incr
+ * is either 1 or -1, for add or removal of a slave). Only
+ * bond_xmit_xor() uses the slave_cnt value, but this is still a good
+ * consistency check.
+ */
+static inline void
+update_slave_cnt(bonding_t *bond, int incr)
+{
+	slave_t *slave = NULL;
+	int expect = bond->slave_cnt + incr;
+
+	bond->slave_cnt = 0;
+	for (slave = bond->prev; slave != (slave_t*)bond;
+	     slave = slave->prev) {
+		bond->slave_cnt++;
+	}
+
+	if (expect != bond->slave_cnt)
+		BUG();
+}
+
+/*
+ * This function detaches the slave <slave> from the list <bond>.
+ * WARNING: no check is made to verify if the slave effectively
+ * belongs to <bond>. It returns <slave> in case it's needed.
+ * Nothing is freed on return, structures are just unchained.
+ * If the bond->current_slave pointer was pointing to <slave>,
+ * it's replaced with slave->next, or NULL if not applicable.
+ *
+ * bond->lock held by caller.
+ */
+static slave_t *
+bond_detach_slave(bonding_t *bond, slave_t *slave)
+{
+	if ((bond == NULL) || (slave == NULL) ||
+	   ((void *)bond == (void *)slave)) {
+		printk(KERN_ERR
+			"bond_detach_slave(): trying to detach "
+			"slave %p from bond %p\n", slave, bond);
+		return slave;
+	}
+
+	if (bond->next == slave) {  /* is the slave at the head ? */
+		if (bond->prev == slave) {  /* is the slave alone ? */
+			write_lock(&bond->ptrlock);
+			bond->current_slave = NULL; /* no slave anymore */
+			write_unlock(&bond->ptrlock);
+			bond->prev = bond->next = (slave_t *)bond;
+		} else { /* not alone */
+			bond->next = slave->next;
+			slave->next->prev = (slave_t *)bond;
+			bond->prev->next = slave->next;
+
+			write_lock(&bond->ptrlock);
+			if (bond->current_slave == slave) {
+				bond->current_slave = slave->next;
+			}
+			write_unlock(&bond->ptrlock);
+		}
+	} else {
+		slave->prev->next = slave->next;
+		if (bond->prev == slave) {  /* is this slave the last one ?
*/ + bond->prev = slave->prev; + } else { + slave->next->prev = slave->prev; + } + + write_lock(&bond->ptrlock); + if (bond->current_slave == slave) { + bond->current_slave = slave->next; + } + write_unlock(&bond->ptrlock); + } + + update_slave_cnt(bond, -1); + + return slave; +} + +static void +bond_attach_slave(struct bonding *bond, struct slave *new_slave) +{ + /* + * queue to the end of the slaves list, make the first element its + * successor, the last one its predecessor, and make it the bond's + * predecessor. + * + * Just to clarify, so future bonding driver hackers don't go through + * the same confusion stage I did trying to figure this out, the + * slaves are stored in a double linked circular list, sortof. + * In the ->next direction, the last slave points to the first slave, + * bypassing bond; only the slaves are in the ->next direction. + * In the ->prev direction, however, the first slave points to bond + * and bond points to the last slave. + * + * It looks like a circle with a little bubble hanging off one side + * in the ->prev direction only. + * + * When going through the list once, its best to start at bond->prev + * and go in the ->prev direction, testing for bond. Doing this + * in the ->next direction doesn't work. Trust me, I know this now. + * :) -mts 2002.03.14 + */ + new_slave->prev = bond->prev; + new_slave->prev->next = new_slave; + bond->prev = new_slave; + new_slave->next = bond->next; + + update_slave_cnt(bond, 1); +} + + +/* + * Less bad way to call ioctl from within the kernel; this needs to be + * done some other way to get the call out of interrupt context. + * Needs "ioctl" variable to be supplied by calling context. + */ +#define IOCTL(dev, arg, cmd) ({ \ + int ret; \ + mm_segment_t fs = get_fs(); \ + set_fs(get_ds()); \ + ret = ioctl(dev, arg, cmd); \ + set_fs(fs); \ + ret; }) + +/* + * Get link speed and duplex from the slave's base driver + * using ethtool. 
If for some reason the call fails or the
+ * values are invalid, fake speed and duplex to 100/Full
+ * and return error.
+ */
+static int bond_update_speed_duplex(struct slave *slave)
+{
+	struct net_device *dev = slave->dev;
+	static int (* ioctl)(struct net_device *, struct ifreq *, int);
+	struct ifreq ifr;
+	struct ethtool_cmd etool;
+
+	ioctl = dev->do_ioctl;
+	if (ioctl) {
+		etool.cmd = ETHTOOL_GSET;
+		ifr.ifr_data = (char*)&etool;
+		if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) {
+			slave->speed = etool.speed;
+			slave->duplex = etool.duplex;
+		} else {
+			goto err_out;
+		}
+	} else {
+		goto err_out;
+	}
+
+	switch (slave->speed) {
+		case SPEED_10:
+		case SPEED_100:
+		case SPEED_1000:
+			break;
+		default:
+			goto err_out;
+	}
+
+	switch (slave->duplex) {
+		case DUPLEX_FULL:
+		case DUPLEX_HALF:
+			break;
+		default:
+			goto err_out;
+	}
+
+	return 0;
+
+err_out:
+	//Fake speed and duplex
+	slave->speed = SPEED_100;
+	slave->duplex = DUPLEX_FULL;
+	return -1;
+}
+
+/*
+ * if <dev> supports MII link status reporting, check its link status.
+ *
+ * We either do MII/ETHTOOL ioctls, or check netif_carrier_ok(),
+ * depending upon the setting of the use_carrier parameter.
+ *
+ * Return either BMSR_LSTATUS, meaning that the link is up (or we
+ * can't tell and just pretend it is), or 0, meaning that the link is
+ * down.
+ *
+ * If reporting is non-zero, instead of faking link up, return -1 if
+ * both ETHTOOL and MII ioctls fail (meaning the device does not
+ * support them). If use_carrier is set, return whatever it says.
+ * It'd be nice if there was a good way to tell if a driver supports
+ * netif_carrier, but there really isn't.
+ */
+static int
+bond_check_dev_link(struct net_device *dev, int reporting)
+{
+	static int (* ioctl)(struct net_device *, struct ifreq *, int);
+	struct ifreq ifr;
+	struct mii_ioctl_data *mii;
+	struct ethtool_value etool;
+
+	if (use_carrier) {
+		return netif_carrier_ok(dev) ?
BMSR_LSTATUS : 0; + } + + ioctl = dev->do_ioctl; + if (ioctl) { + /* TODO: set pointer to correct ioctl on a per team member */ + /* bases to make this more efficient. that is, once */ + /* we determine the correct ioctl, we will always */ + /* call it and not the others for that team */ + /* member. */ + + /* + * We cannot assume that SIOCGMIIPHY will also read a + * register; not all network drivers (e.g., e100) + * support that. + */ + + /* Yes, the mii is overlaid on the ifreq.ifr_ifru */ + mii = (struct mii_ioctl_data *)&ifr.ifr_data; + if (IOCTL(dev, &ifr, SIOCGMIIPHY) == 0) { + mii->reg_num = MII_BMSR; + if (IOCTL(dev, &ifr, SIOCGMIIREG) == 0) { + return mii->val_out & BMSR_LSTATUS; + } + } + + /* try SIOCETHTOOL ioctl, some drivers cache ETHTOOL_GLINK */ + /* for a period of time so we attempt to get link status */ + /* from it last if the above MII ioctls fail... */ + etool.cmd = ETHTOOL_GLINK; + ifr.ifr_data = (char*)&etool; + if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) { + if (etool.data == 1) { + return BMSR_LSTATUS; + } else { +#ifdef BONDING_DEBUG + printk(KERN_INFO + ":: SIOCETHTOOL shows link down \n"); +#endif + return 0; + } + } + + } + + /* + * If reporting, report that either there's no dev->do_ioctl, + * or both SIOCGMIIREG and SIOCETHTOOL failed (meaning that we + * cannot report link status). If not reporting, pretend + * we're ok. + */ + return reporting ? -1 : BMSR_LSTATUS; +} + +static u16 bond_check_mii_link(bonding_t *bond) +{ + int has_active_interface = 0; + unsigned long flags; + + read_lock_irqsave(&bond->lock, flags); + read_lock(&bond->ptrlock); + has_active_interface = (bond->current_slave != NULL); + read_unlock(&bond->ptrlock); + read_unlock_irqrestore(&bond->lock, flags); + + return (has_active_interface ? 
BMSR_LSTATUS : 0); +} + +static int bond_open(struct net_device *dev) +{ + struct timer_list *timer = &((struct bonding *)(dev->priv))->mii_timer; + struct timer_list *arp_timer = &((struct bonding *)(dev->priv))->arp_timer; + MOD_INC_USE_COUNT; + + if (miimon > 0) { /* link check interval, in milliseconds. */ + init_timer(timer); + timer->expires = jiffies + (miimon * HZ / 1000); + timer->data = (unsigned long)dev; + timer->function = (void *)&bond_mii_monitor; + add_timer(timer); + } + + if (arp_interval> 0) { /* arp interval, in milliseconds. */ + init_timer(arp_timer); + arp_timer->expires = jiffies + (arp_interval * HZ / 1000); + arp_timer->data = (unsigned long)dev; + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + arp_timer->function = (void *)&activebackup_arp_monitor; + } else { + arp_timer->function = (void *)&loadbalance_arp_monitor; + } + add_timer(arp_timer); + } + return 0; +} + +static int bond_close(struct net_device *master) +{ + bonding_t *bond = (struct bonding *) master->priv; + unsigned long flags; + + write_lock_irqsave(&bond->lock, flags); + + if (miimon > 0) { /* link check interval, in milliseconds. */ + del_timer(&bond->mii_timer); + } + if (arp_interval> 0) { /* arp interval, in milliseconds. 
*/ + del_timer(&bond->arp_timer); + if (arp_target_hw_addr != NULL) { + kfree(arp_target_hw_addr); + arp_target_hw_addr = NULL; + } + } + + /* Release the bonded slaves */ + bond_release_all(master); + bond_mc_list_destroy (bond); + + write_unlock_irqrestore(&bond->lock, flags); + + MOD_DEC_USE_COUNT; + return 0; +} + +/* + * flush all members of flush->mc_list from device dev->mc_list + */ +static void bond_mc_list_flush(struct net_device *dev, struct net_device *flush) +{ + struct dev_mc_list *dmi; + + for (dmi = flush->mc_list; dmi != NULL; dmi = dmi->next) + dev_mc_delete(dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); +} + +/* + * Totally destroys the mc_list in bond + */ +static void bond_mc_list_destroy(struct bonding *bond) +{ + struct dev_mc_list *dmi; + + dmi = bond->mc_list; + while (dmi) { + bond->mc_list = dmi->next; + kfree(dmi); + dmi = bond->mc_list; + } +} + +/* + * Add a Multicast address to every slave in the bonding group + */ +static void bond_mc_add(bonding_t *bond, void *addr, int alen) +{ + slave_t *slave; + switch (multicast_mode) { + case BOND_MULTICAST_ACTIVE : + /* write lock already acquired */ + if (bond->current_slave != NULL) + dev_mc_add(bond->current_slave->dev, addr, alen, 0); + break; + case BOND_MULTICAST_ALL : + for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev) + dev_mc_add(slave->dev, addr, alen, 0); + break; + case BOND_MULTICAST_DISABLED : + break; + } +} + +/* + * Remove a multicast address from every slave in the bonding group + */ +static void bond_mc_delete(bonding_t *bond, void *addr, int alen) +{ + slave_t *slave; + switch (multicast_mode) { + case BOND_MULTICAST_ACTIVE : + /* write lock already acquired */ + if (bond->current_slave != NULL) + dev_mc_delete(bond->current_slave->dev, addr, alen, 0); + break; + case BOND_MULTICAST_ALL : + for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev) + dev_mc_delete(slave->dev, addr, alen, 0); + break; + case BOND_MULTICAST_DISABLED : + break; + } 
+}
+
+/*
+ * Copy all the Multicast addresses from src to the bonding device dst
+ */
+static int bond_mc_list_copy (struct dev_mc_list *src, struct bonding *dst,
+			      int gpf_flag)
+{
+	struct dev_mc_list *dmi, *new_dmi;
+
+	for (dmi = src; dmi != NULL; dmi = dmi->next) {
+		new_dmi = kmalloc(sizeof(struct dev_mc_list), gpf_flag);
+
+		if (new_dmi == NULL) {
+			return -ENOMEM;
+		}
+
+		new_dmi->next = dst->mc_list;
+		dst->mc_list = new_dmi;
+
+		new_dmi->dmi_addrlen = dmi->dmi_addrlen;
+		memcpy(new_dmi->dmi_addr, dmi->dmi_addr, dmi->dmi_addrlen);
+		new_dmi->dmi_users = dmi->dmi_users;
+		new_dmi->dmi_gusers = dmi->dmi_gusers;
+	}
+	return 0;
+}
+
+/*
+ * Returns non-zero if dmi1 and dmi2 are the same, 0 otherwise
+ */
+static inline int dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2)
+{
+	return memcmp(dmi1->dmi_addr, dmi2->dmi_addr, dmi1->dmi_addrlen) == 0 &&
+	       dmi1->dmi_addrlen == dmi2->dmi_addrlen;
+}
+
+/*
+ * Push the promiscuity flag down to all slaves
+ */
+static void bond_set_promiscuity(bonding_t *bond, int inc)
+{
+	slave_t *slave;
+	switch (multicast_mode) {
+	case BOND_MULTICAST_ACTIVE :
+		/* write lock already acquired */
+		if (bond->current_slave != NULL)
+			dev_set_promiscuity(bond->current_slave->dev, inc);
+		break;
+	case BOND_MULTICAST_ALL :
+		for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev)
+			dev_set_promiscuity(slave->dev, inc);
+		break;
+	case BOND_MULTICAST_DISABLED :
+		break;
+	}
+}
+
+/*
+ * Push the allmulti flag down to all slaves
+ */
+static void bond_set_allmulti(bonding_t *bond, int inc)
+{
+	slave_t *slave;
+	switch (multicast_mode) {
+	case BOND_MULTICAST_ACTIVE :
+		/* write lock already acquired */
+		if (bond->current_slave != NULL)
+			dev_set_allmulti(bond->current_slave->dev, inc);
+		break;
+	case BOND_MULTICAST_ALL :
+		for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev)
+			dev_set_allmulti(slave->dev, inc);
+		break;
+	case BOND_MULTICAST_DISABLED :
+		break;
+	}
+}
+
+/*
+ * returns dmi
entry if found, NULL otherwise + */ +static struct dev_mc_list* bond_mc_list_find_dmi(struct dev_mc_list *dmi, + struct dev_mc_list *mc_list) +{ + struct dev_mc_list *idmi; + + for (idmi = mc_list; idmi != NULL; idmi = idmi->next) { + if (dmi_same(dmi, idmi)) { + return idmi; + } + } + return NULL; +} + +static void set_multicast_list(struct net_device *master) +{ + bonding_t *bond = master->priv; + struct dev_mc_list *dmi; + unsigned long flags = 0; + + if (multicast_mode == BOND_MULTICAST_DISABLED) + return; + /* + * Lock the private data for the master + */ + write_lock_irqsave(&bond->lock, flags); + + /* set promiscuity flag to slaves */ + if ( (master->flags & IFF_PROMISC) && !(bond->flags & IFF_PROMISC) ) + bond_set_promiscuity(bond, 1); + + if ( !(master->flags & IFF_PROMISC) && (bond->flags & IFF_PROMISC) ) + bond_set_promiscuity(bond, -1); + + /* set allmulti flag to slaves */ + if ( (master->flags & IFF_ALLMULTI) && !(bond->flags & IFF_ALLMULTI) ) + bond_set_allmulti(bond, 1); + + if ( !(master->flags & IFF_ALLMULTI) && (bond->flags & IFF_ALLMULTI) ) + bond_set_allmulti(bond, -1); + + bond->flags = master->flags; + + /* looking for addresses to add to slaves' mc list */ + for (dmi = master->mc_list; dmi != NULL; dmi = dmi->next) { + if (bond_mc_list_find_dmi(dmi, bond->mc_list) == NULL) + bond_mc_add(bond, dmi->dmi_addr, dmi->dmi_addrlen); + } + + /* looking for addresses to delete from slaves' list */ + for (dmi = bond->mc_list; dmi != NULL; dmi = dmi->next) { + if (bond_mc_list_find_dmi(dmi, master->mc_list) == NULL) + bond_mc_delete(bond, dmi->dmi_addr, dmi->dmi_addrlen); + } + + + /* save master's multicast list */ + bond_mc_list_destroy (bond); + bond_mc_list_copy (master->mc_list, bond, GFP_ATOMIC); + + write_unlock_irqrestore(&bond->lock, flags); +} + +/* + * Update the mc list and multicast-related flags for the new and + * old active slaves (if any) according to the multicast mode + */ +static void bond_mc_update(bonding_t *bond, slave_t *new, 
slave_t *old) +{ + struct dev_mc_list *dmi; + + switch(multicast_mode) { + case BOND_MULTICAST_ACTIVE : + if (bond->device->flags & IFF_PROMISC) { + if (old != NULL && new != old) + dev_set_promiscuity(old->dev, -1); + dev_set_promiscuity(new->dev, 1); + } + if (bond->device->flags & IFF_ALLMULTI) { + if (old != NULL && new != old) + dev_set_allmulti(old->dev, -1); + dev_set_allmulti(new->dev, 1); + } + /* first remove all mc addresses from old slave if any, + and _then_ add them to new active slave */ + if (old != NULL && new != old) { + for (dmi = bond->device->mc_list; dmi != NULL; dmi = dmi->next) + dev_mc_delete(old->dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); + } + for (dmi = bond->device->mc_list; dmi != NULL; dmi = dmi->next) + dev_mc_add(new->dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); + break; + case BOND_MULTICAST_ALL : + /* nothing to do: mc list is already up-to-date on all slaves */ + break; + case BOND_MULTICAST_DISABLED : + break; + } +} + +/* enslave device to bond device */ +static int bond_enslave(struct net_device *master_dev, + struct net_device *slave_dev) +{ + bonding_t *bond = NULL; + slave_t *new_slave = NULL; + unsigned long flags = 0; + unsigned long rflags = 0; + int err = 0; + struct dev_mc_list *dmi; + struct in_ifaddr **ifap; + struct in_ifaddr *ifa; + int link_reporting; + struct sockaddr addr; + + if (master_dev == NULL || slave_dev == NULL) { + return -ENODEV; + } + bond = (struct bonding *) master_dev->priv; + + if (slave_dev->do_ioctl == NULL) { + printk(KERN_DEBUG + "Warning : no link monitoring support for %s\n", + slave_dev->name); + } + + /* This breaks backward comaptibility with old versions + of ifenslave which open the slave before enalsving */ + /* already up. 
*/
+	if ((slave_dev->flags & IFF_UP) == IFF_UP) {
+#ifdef BONDING_DEBUG
+		printk(KERN_CRIT "Error, slave_dev is up\n");
+#endif
+		return -EBUSY;
+	}
+
+	/* already enslaved */
+	if (master_dev->flags & IFF_SLAVE || slave_dev->flags & IFF_SLAVE) {
+#ifdef BONDING_DEBUG
+		printk(KERN_CRIT "Error, Device was already enslaved\n");
+#endif
+		return -EBUSY;
+	}
+
+	/* bond must be initialized by bond_open() before enslaving */
+	if ((master_dev->flags & IFF_UP) != IFF_UP) {
+#ifdef BONDING_DEBUG
+		printk(KERN_CRIT "Error, master_dev is not up\n");
+#endif
+		return -EPERM;
+	}
+
+	if (slave_dev->set_mac_address == NULL) {
+		printk(KERN_CRIT " The slave device you specified does not support"
+		       " setting the MAC address.\n Your kernel likely does not"
+		       " support slave devices.\n");
+		return -EOPNOTSUPP;
+	}
+
+	if ((new_slave = kmalloc(sizeof(slave_t), GFP_ATOMIC)) == NULL) {
+		return -ENOMEM;
+	}
+	memset(new_slave, 0, sizeof(slave_t));
+
+	/* save slave's original flags before calling */
+	/* netdev_set_master and dev_open */
+	new_slave->original_flags = slave_dev->flags;
+
+	/* save slave's original ("permanent") mac address for
+	   modes that need it, and for restoring it upon release,
+	   and then set it to the master's address */
+	memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);
+
+	if (bond->next != (slave_t*)bond) {
+		/* set slave to master's mac address
+		   The application already set the master's
+		   mac address to that of the first slave */
+		memcpy(addr.sa_data, master_dev->dev_addr, ETH_ALEN);
+		addr.sa_family = slave_dev->type;
+		err = slave_dev->set_mac_address(slave_dev, &addr);
+		if (err) {
+#ifdef BONDING_DEBUG
+			printk(KERN_CRIT "Error %d calling set_mac_address\n", err);
+#endif
+			goto err_free;
+		}
+	}
+
+	/* open the slave since the application closed it */
+	err = dev_open(slave_dev);
+	if (err) {
+#ifdef BONDING_DEBUG
+		printk(KERN_CRIT "Opening slave %s failed\n", slave_dev->name);
+#endif
+		goto err_restore_mac;
+	}
+
+	err =
netdev_set_master(slave_dev, master_dev); + + if (err) { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "Error %d calling netdev_set_master\n", err); +#endif + goto err_close; + } + + new_slave->dev = slave_dev; + + if (multicast_mode == BOND_MULTICAST_ALL) { + /* set promiscuity level to new slave */ + if (master_dev->flags & IFF_PROMISC) + dev_set_promiscuity(slave_dev, 1); + + /* set allmulti level to new slave */ + if (master_dev->flags & IFF_ALLMULTI) + dev_set_allmulti(slave_dev, 1); + + /* upload master's mc_list to new slave */ + for (dmi = master_dev->mc_list; dmi != NULL; dmi = dmi->next) + dev_mc_add (slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); + } + + write_lock_irqsave(&bond->lock, flags); + + bond_attach_slave(bond, new_slave); + new_slave->delay = 0; + new_slave->link_failure_count = 0; + + if (miimon > 0 && !use_carrier) { + link_reporting = bond_check_dev_link(slave_dev, 1); + + if ((link_reporting == -1) && (arp_interval == 0)) { + /* + * miimon is set but a bonded network driver + * does not support ETHTOOL/MII and + * arp_interval is not set. Note: if + * use_carrier is enabled, we will never go + * here (because netif_carrier is always + * supported); thus, we don't need to change + * the messages for netif_carrier. + */ + printk(KERN_ERR + "bond_enslave(): MII and ETHTOOL support not " + "available for interface %s, and " + "arp_interval/arp_ip_target module parameters " + "not specified, thus bonding will not detect " + "link failures! 
see bonding.txt for details.\n", + slave_dev->name); + } else if (link_reporting == -1) { + /* unable get link status using mii/ethtool */ + printk(KERN_WARNING + "bond_enslave: can't get link status from " + "interface %s; the network driver associated " + "with this interface does not support " + "MII or ETHTOOL link status reporting, thus " + "miimon has no effect on this interface.\n", + slave_dev->name); + } + } + + /* check for initial state */ + if ((miimon <= 0) || + (bond_check_dev_link(slave_dev, 0) == BMSR_LSTATUS)) { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "Initial state of slave_dev is BOND_LINK_UP\n"); +#endif + new_slave->link = BOND_LINK_UP; + new_slave->jiffies = jiffies; + } + else { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "Initial state of slave_dev is BOND_LINK_DOWN\n"); +#endif + new_slave->link = BOND_LINK_DOWN; + } + + if (bond_update_speed_duplex(new_slave) && (new_slave->link == BOND_LINK_UP) ) { + printk(KERN_WARNING + "bond_enslave(): failed to get speed/duplex from %s, " + "speed forced to 100Mbps, duplex forced to Full.\n", + new_slave->dev->name); + } + + /* if we're in active-backup mode, we need one and only one active + * interface. The backup interfaces will have their NOARP flag set + * because we need them to be completely deaf and not to respond to + * any ARP request on the network to avoid fooling a switch. Thus, + * since we guarantee that current_slave always point to the last + * usable interface, we just have to verify this interface's flag. 
+ */ + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + if (((bond->current_slave == NULL) + || (bond->current_slave->dev->flags & IFF_NOARP)) + && (new_slave->link == BOND_LINK_UP)) { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "This is the first active slave\n"); +#endif + /* first slave or no active slave yet, and this link + is OK, so make this interface the active one */ + bond->current_slave = new_slave; + bond_set_slave_active_flags(new_slave); + bond_mc_update(bond, new_slave, NULL); + } + else { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "This is just a backup slave\n"); +#endif + bond_set_slave_inactive_flags(new_slave); + } + read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); + ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list); + ifa = *ifap; + my_ip = ifa->ifa_address; + read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); + + /* if there is a primary slave, remember it */ + if (primary != NULL) + if( strcmp(primary, new_slave->dev->name) == 0) + bond->primary_slave = new_slave; + } else { +#ifdef BONDING_DEBUG + printk(KERN_CRIT "This slave is always active in trunk mode\n"); +#endif + /* always active in trunk mode */ + new_slave->state = BOND_STATE_ACTIVE; + if (bond->current_slave == NULL) + bond->current_slave = new_slave; + } + + write_unlock_irqrestore(&bond->lock, flags); + + printk (KERN_INFO "%s: enslaving %s as a%s interface with a%s link.\n", + master_dev->name, slave_dev->name, + new_slave->state == BOND_STATE_ACTIVE ? "n active" : " backup", + new_slave->link == BOND_LINK_UP ? "n up" : " down"); + + //enslave is successful + return 0; + +// Undo stages on error +err_close: + dev_close(slave_dev); + +err_restore_mac: + memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); + +err_free: + kfree(new_slave); + return err; +} + +/* + * This function changes the active slave to slave <slave_dev>. 
+ * It returns -EINVAL in the following cases. + * - <slave_dev> is not found in the list. + * - There is no active slave now. + * - <slave_dev> is already active. + * - The link state of <slave_dev> is not BOND_LINK_UP. + * - <slave_dev> is not running. + * In these cases, this function does nothing. + * In the other cases, current_slave pointer is changed and 0 is returned. + */ +static int bond_change_active(struct net_device *master_dev, struct net_device *slave_dev) +{ + bonding_t *bond; + slave_t *slave; + slave_t *oldactive = NULL; + slave_t *newactive = NULL; + unsigned long flags; + int ret = 0; + + if (master_dev == NULL || slave_dev == NULL) { + return -ENODEV; + } + + bond = (struct bonding *) master_dev->priv; + write_lock_irqsave(&bond->lock, flags); + slave = (slave_t *)bond; + oldactive = bond->current_slave; + + while ((slave = slave->prev) != (slave_t *)bond) { + if(slave_dev == slave->dev) { + newactive = slave; + break; + } + } + + if ((newactive != NULL)&& + (oldactive != NULL)&& + (newactive != oldactive)&& + (newactive->link == BOND_LINK_UP)&& + IS_UP(newactive->dev)) { + bond_set_slave_inactive_flags(oldactive); + bond_set_slave_active_flags(newactive); + bond_mc_update(bond, newactive, oldactive); + bond->current_slave = newactive; + printk("%s : activate %s(old : %s)\n", + master_dev->name, newactive->dev->name, + oldactive->dev->name); + } + else { + ret = -EINVAL; + } + write_unlock_irqrestore(&bond->lock, flags); + return ret; +} + +/* Choose a new valid interface from the pool, set it active + * and make it the current slave. If no valid interface is + * found, the oldest slave in BACK state is chosen and + * activated. If none is found, it's considered as no + * interfaces left so the current slave is set to NULL. + * The result is a pointer to the current slave. + * + * Since this function sends messages tails through printk, the caller + * must have started something like `printk(KERN_INFO "xxxx ");'. + * + * Warning: must put locks around the call to this function if needed. 
+ */ +slave_t *change_active_interface(bonding_t *bond) +{ + slave_t *newslave, *oldslave; + slave_t *bestslave = NULL; + int mintime; + + read_lock(&bond->ptrlock); + newslave = oldslave = bond->current_slave; + read_unlock(&bond->ptrlock); + + if (newslave == NULL) { /* there were no active slaves left */ + if (bond->next != (slave_t *)bond) { /* found one slave */ + write_lock(&bond->ptrlock); + newslave = bond->current_slave = bond->next; + write_unlock(&bond->ptrlock); + } else { + + printk (" but could not find any %s interface.\n", + (bond_mode == BOND_MODE_ACTIVEBACKUP) ? "backup":"other"); + write_lock(&bond->ptrlock); + bond->current_slave = (slave_t *)NULL; + write_unlock(&bond->ptrlock); + return NULL; /* still no slave, return NULL */ + } + } else if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + /* make sure oldslave doesn't send arps - this could + * cause a ping-pong effect between interfaces since they + * would be able to tx arps - in active backup only one + * slave should be able to tx arps, and that should be + * the current_slave; the only exception is when all + * slaves have gone down, then only one non-current slave can + * send arps at a time; clearing oldslaves' mc list is handled + * later in this function. 
+ */ + bond_set_slave_inactive_flags(oldslave); + } + + mintime = updelay; + + /* first try the primary link; if arping, a link must tx/rx traffic + * before it can be considered the current_slave - also, we would skip + * slaves between the current_slave and primary_slave that may be up + * and able to arp + */ + if ((bond->primary_slave != NULL) && (arp_interval == 0)) { + if (IS_UP(bond->primary_slave->dev)) + newslave = bond->primary_slave; + } + + do { + if (IS_UP(newslave->dev)) { + if (newslave->link == BOND_LINK_UP) { + /* this one is immediately usable */ + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + bond_set_slave_active_flags(newslave); + bond_mc_update(bond, newslave, oldslave); + printk (" and making interface %s the active one.\n", + newslave->dev->name); + } + else { + printk (" and setting pointer to interface %s.\n", + newslave->dev->name); + } + + write_lock(&bond->ptrlock); + bond->current_slave = newslave; + write_unlock(&bond->ptrlock); + return newslave; + } + else if (newslave->link == BOND_LINK_BACK) { + /* link up, but waiting for stabilization */ + if (newslave->delay < mintime) { + mintime = newslave->delay; + bestslave = newslave; + } + } + } + } while ((newslave = newslave->next) != oldslave); + + /* no usable backup found, we'll see if we at least got a link that was + coming back for a long time, and could possibly already be usable. + */ + + if (bestslave != NULL) { + /* early take-over. 
*/ + printk (" and making interface %s the active one %d ms earlier.\n", + bestslave->dev->name, + (updelay - bestslave->delay)*miimon); + + bestslave->delay = 0; + bestslave->link = BOND_LINK_UP; + bestslave->jiffies = jiffies; + bond_set_slave_active_flags(bestslave); + bond_mc_update(bond, bestslave, oldslave); + write_lock(&bond->ptrlock); + bond->current_slave = bestslave; + write_unlock(&bond->ptrlock); + return bestslave; + } + + if ((bond_mode == BOND_MODE_ACTIVEBACKUP) && + (multicast_mode == BOND_MULTICAST_ACTIVE) && + (oldslave != NULL)) { + /* flush bonds (master's) mc_list from oldslave since it wasn't + * updated (and deleted) above + */ + bond_mc_list_flush(oldslave->dev, bond->device); + if (bond->device->flags & IFF_PROMISC) { + dev_set_promiscuity(oldslave->dev, -1); + } + if (bond->device->flags & IFF_ALLMULTI) { + dev_set_allmulti(oldslave->dev, -1); + } + } + + printk (" but could not find any %s interface.\n", + (bond_mode == BOND_MODE_ACTIVEBACKUP) ? "backup":"other"); + + /* absolutely nothing found. let's return NULL */ + write_lock(&bond->ptrlock); + bond->current_slave = (slave_t *)NULL; + write_unlock(&bond->ptrlock); + return NULL; +} + +/* + * Try to release the slave device from the bond device + * It is legal to access current_slave without a lock because all the function + * is write-locked. + * + * The rules for slave state should be: + * for Active/Backup: + * Active stays on all backups go down + * for Bonded connections: + * The first up interface should be left on and all others downed. 
+ */ +static int bond_release(struct net_device *master, struct net_device *slave) +{ + bonding_t *bond; + slave_t *our_slave, *old_current; + unsigned long flags; + struct sockaddr addr; + + if (master == NULL || slave == NULL) { + return -ENODEV; + } + + bond = (struct bonding *) master->priv; + + /* master already enslaved, or slave not enslaved, + or no slave for this master */ + if ((master->flags & IFF_SLAVE) || !(slave->flags & IFF_SLAVE)) { + printk (KERN_DEBUG "%s: cannot release %s.\n", master->name, slave->name); + return -EINVAL; + } + + write_lock_irqsave(&bond->lock, flags); + bond->current_arp_slave = NULL; + our_slave = (slave_t *)bond; + old_current = bond->current_slave; + while ((our_slave = our_slave->prev) != (slave_t *)bond) { + if (our_slave->dev == slave) { + bond_detach_slave(bond, our_slave); + + printk (KERN_INFO "%s: releasing %s interface %s", + master->name, + (our_slave->state == BOND_STATE_ACTIVE) ? "active" : "backup", + slave->name); + + if (our_slave == old_current) { + /* find a new interface and be verbose */ + change_active_interface(bond); + } else { + printk(".\n"); + } + + if (bond->current_slave == NULL) { + printk(KERN_INFO + "%s: now running without any active interface !\n", + master->name); + } + + if (bond->primary_slave == our_slave) { + bond->primary_slave = NULL; + } + + break; + } + + } + write_unlock_irqrestore(&bond->lock, flags); + + if (our_slave == (slave_t *)bond) { + /* if we get here, it's because the device was not found */ + printk (KERN_INFO "%s: %s not enslaved\n", master->name, slave->name); + return -EINVAL; + } + + /* undo settings and restore original values */ + + if (multicast_mode == BOND_MULTICAST_ALL) { + /* flush master's mc_list from slave */ + bond_mc_list_flush (slave, master); + + /* unset promiscuity level from slave */ + if (master->flags & IFF_PROMISC) + dev_set_promiscuity(slave, -1); + + /* unset allmulti level from slave */ + if (master->flags & IFF_ALLMULTI) + 
dev_set_allmulti(slave, -1); + } + + netdev_set_master(slave, NULL); + + /* close slave before restoring its mac address */ + dev_close(slave); + + /* restore original ("permanent") mac address*/ + memcpy(addr.sa_data, our_slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave->type; + slave->set_mac_address(slave, &addr); + + /* restore the original state of the IFF_NOARP flag that might have */ + /* been set by bond_set_slave_inactive_flags() */ + if ((our_slave->original_flags & IFF_NOARP) == 0) { + slave->flags &= ~IFF_NOARP; + } + + kfree(our_slave); + + /* if the last slave was removed, zero the mac address + of the master so it will be set by the application + to the mac address of the first slave */ + if (bond->next == (slave_t*)bond) { + memset(master->dev_addr, 0, master->addr_len); + } + + return 0; /* deletion OK */ +} + +/* + * This function releases all slaves. + * Warning: must put write-locks around the call to this function. + */ +static int bond_release_all(struct net_device *master) +{ + bonding_t *bond; + slave_t *our_slave; + struct net_device *slave_dev; + struct sockaddr addr; + + if (master == NULL) { + return -ENODEV; + } + + if (master->flags & IFF_SLAVE) { + return -EINVAL; + } + + bond = (struct bonding *) master->priv; + bond->current_arp_slave = NULL; + bond->current_slave = NULL; + bond->primary_slave = NULL; + + while ((our_slave = bond->prev) != (slave_t *)bond) { + slave_dev = our_slave->dev; + bond_detach_slave(bond, our_slave); + + if (multicast_mode == BOND_MULTICAST_ALL + || (multicast_mode == BOND_MULTICAST_ACTIVE + && bond->current_slave == our_slave)) { + + /* flush master's mc_list from slave */ + bond_mc_list_flush (slave_dev, master); + + /* unset promiscuity level from slave */ + if (master->flags & IFF_PROMISC) + dev_set_promiscuity(slave_dev, -1); + + /* unset allmulti level from slave */ + if (master->flags & IFF_ALLMULTI) + dev_set_allmulti(slave_dev, -1); + } + + /* Can be safely called from inside the bond lock + 
since traffic and timers have already stopped + */ + netdev_set_master(slave_dev, NULL); + + /* close slave before restoring its mac address */ + dev_close(slave_dev); + + /* restore original ("permanent") mac address*/ + memcpy(addr.sa_data, our_slave->perm_hwaddr, ETH_ALEN); + addr.sa_family = slave_dev->type; + slave_dev->set_mac_address(slave_dev, &addr); + + /* restore the original state of the IFF_NOARP flag that might have */ + /* been set by bond_set_slave_inactive_flags() */ + if ((our_slave->original_flags & IFF_NOARP) == 0) { + slave_dev->flags &= ~IFF_NOARP; + } + + kfree(our_slave); + } + + /* zero the mac address of the master so it will be + set by the application to the mac address of the + first slave */ + memset(master->dev_addr, 0, master->addr_len); + + printk (KERN_INFO "%s: released all slaves\n", master->name); + + return 0; +} + +/* this function is called regularly to monitor each slave's link. */ +static void bond_mii_monitor(struct net_device *master) +{ + bonding_t *bond = (struct bonding *) master->priv; + slave_t *slave, *bestslave, *oldcurrent; + unsigned long flags; + int slave_died = 0; + + read_lock_irqsave(&bond->lock, flags); + + /* we will try to read the link status of each of our slaves, and + * set their IFF_RUNNING flag appropriately. For each slave not + * supporting MII status, we won't do anything so that a user-space + * program could monitor the link itself if needed. 
+ */ + + bestslave = NULL; + slave = (slave_t *)bond; + + read_lock(&bond->ptrlock); + oldcurrent = bond->current_slave; + read_unlock(&bond->ptrlock); + + while ((slave = slave->prev) != (slave_t *)bond) { + /* use updelay+1 to match an UP slave even when updelay is 0 */ + int mindelay = updelay + 1; + struct net_device *dev = slave->dev; + int link_state; + + link_state = bond_check_dev_link(dev, 0); + + switch (slave->link) { + case BOND_LINK_UP: /* the link was up */ + if (link_state == BMSR_LSTATUS) { + /* link stays up, tell that this one + is immediately available */ + if (IS_UP(dev) && (mindelay > -2)) { + /* -2 is the best case : + this slave was already up */ + mindelay = -2; + bestslave = slave; + } + break; + } + else { /* link going down */ + slave->link = BOND_LINK_FAIL; + slave->delay = downdelay; + if (slave->link_failure_count < UINT_MAX) { + slave->link_failure_count++; + } + if (downdelay > 0) { + printk (KERN_INFO + "%s: link status down for %sinterface " + "%s, disabling it in %d ms.\n", + master->name, + IS_UP(dev) + ? ((bond_mode == BOND_MODE_ACTIVEBACKUP) + ? ((slave == oldcurrent) + ? "active " : "backup ") + : "") + : "idle ", + dev->name, + downdelay * miimon); + } + } + /* no break ! 
fall through the BOND_LINK_FAIL test to + ensure proper action to be taken + */ + case BOND_LINK_FAIL: /* the link has just gone down */ + if (link_state != BMSR_LSTATUS) { + /* link stays down */ + if (slave->delay <= 0) { + /* link down for too long time */ + slave->link = BOND_LINK_DOWN; + /* in active/backup mode, we must + completely disable this interface */ + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + bond_set_slave_inactive_flags(slave); + } + printk(KERN_INFO + "%s: link status definitely down " + "for interface %s, disabling it", + master->name, + dev->name); + + read_lock(&bond->ptrlock); + if (slave == bond->current_slave) { + read_unlock(&bond->ptrlock); + /* find a new interface and be verbose */ + change_active_interface(bond); + } else { + read_unlock(&bond->ptrlock); + printk(".\n"); + } + slave_died = 1; + } else { + slave->delay--; + } + } else { + /* link up again */ + slave->link = BOND_LINK_UP; + slave->jiffies = jiffies; + printk(KERN_INFO + "%s: link status up again after %d ms " + "for interface %s.\n", + master->name, + (downdelay - slave->delay) * miimon, + dev->name); + + if (IS_UP(dev) && (mindelay > -1)) { + /* -1 is a good case : this slave went + down only for a short time */ + mindelay = -1; + bestslave = slave; + } + } + break; + case BOND_LINK_DOWN: /* the link was down */ + if (link_state != BMSR_LSTATUS) { + /* the link stays down, nothing more to do */ + break; + } else { /* link going up */ + slave->link = BOND_LINK_BACK; + slave->delay = updelay; + + if (updelay > 0) { + /* if updelay == 0, no need to + advertise about a 0 ms delay */ + printk (KERN_INFO + "%s: link status up for interface" + " %s, enabling it in %d ms.\n", + master->name, + dev->name, + updelay * miimon); + } + } + /* no break ! fall through the BOND_LINK_BACK state in + case there's something to do. 
+ */ + case BOND_LINK_BACK: /* the link has just come back */ + if (link_state != BMSR_LSTATUS) { + /* link down again */ + slave->link = BOND_LINK_DOWN; + printk(KERN_INFO + "%s: link status down again after %d ms " + "for interface %s.\n", + master->name, + (updelay - slave->delay) * miimon, + dev->name); + } else { + /* link stays up */ + if (slave->delay == 0) { + /* now the link has been up for long time enough */ + slave->link = BOND_LINK_UP; + slave->jiffies = jiffies; + + if (bond_mode != BOND_MODE_ACTIVEBACKUP) { + /* make it immediately active */ + slave->state = BOND_STATE_ACTIVE; + } else if (slave != bond->primary_slave) { + /* prevent it from being the active one */ + slave->state = BOND_STATE_BACKUP; + } + + printk(KERN_INFO + "%s: link status definitely up " + "for interface %s.\n", + master->name, + dev->name); + + if ( (bond->primary_slave != NULL) + && (slave == bond->primary_slave) ) + change_active_interface(bond); + } + else + slave->delay--; + + /* we'll also look for the mostly eligible slave */ + if (bond->primary_slave == NULL) { + if (IS_UP(dev) && (slave->delay < mindelay)) { + mindelay = slave->delay; + bestslave = slave; + } + } else if ( (IS_UP(bond->primary_slave->dev)) || + ( (!IS_UP(bond->primary_slave->dev)) && + (IS_UP(dev) && (slave->delay < mindelay)) ) ) { + mindelay = slave->delay; + bestslave = slave; + } + } + break; + } /* end of switch */ + + bond_update_speed_duplex(slave); + + } /* end of while */ + + /* + * if there's no active interface and we discovered that one + * of the slaves could be activated earlier, so we do it. + */ + read_lock(&bond->ptrlock); + oldcurrent = bond->current_slave; + read_unlock(&bond->ptrlock); + + /* no active interface at the moment or need to bring up the primary */ + if (oldcurrent == NULL) { /* no active interface at the moment */ + if (bestslave != NULL) { /* last chance to find one ? 
*/ + if (bestslave->link == BOND_LINK_UP) { + printk (KERN_INFO + "%s: making interface %s the new active one.\n", + master->name, bestslave->dev->name); + } else { + printk (KERN_INFO + "%s: making interface %s the new " + "active one %d ms earlier.\n", + master->name, bestslave->dev->name, + (updelay - bestslave->delay) * miimon); + + bestslave->delay = 0; + bestslave->link = BOND_LINK_UP; + bestslave->jiffies = jiffies; + } + + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + bond_set_slave_active_flags(bestslave); + bond_mc_update(bond, bestslave, NULL); + } else { + bestslave->state = BOND_STATE_ACTIVE; + } + write_lock(&bond->ptrlock); + bond->current_slave = bestslave; + write_unlock(&bond->ptrlock); + } else if (slave_died) { + /* print this message only once a slave has just died */ + printk(KERN_INFO + "%s: now running without any active interface !\n", + master->name); + } + } + + read_unlock_irqrestore(&bond->lock, flags); + /* re-arm the timer */ + mod_timer(&bond->mii_timer, jiffies + (miimon * HZ / 1000)); +} + +/* + * this function is called regularly to monitor each slave's link + * ensuring that traffic is being sent and received when arp monitoring + * is used in load-balancing mode. if the adapter has been dormant, then an + * arp is transmitted to generate traffic. see activebackup_arp_monitor for + * arp monitoring in active backup mode. + */ +static void loadbalance_arp_monitor(struct net_device *master) +{ + bonding_t *bond; + unsigned long flags; + slave_t *slave; + int the_delta_in_ticks = arp_interval * HZ / 1000; + int next_timer = jiffies + (arp_interval * HZ / 1000); + + bond = (struct bonding *) master->priv; + if (master->priv == NULL) { + mod_timer(&bond->arp_timer, next_timer); + return; + } + + read_lock_irqsave(&bond->lock, flags); + + /* TODO: investigate why rtnl_shlock_nowait and rtnl_exlock_nowait + * are called below and add comment why they are required... 
+ */ + if ((!IS_UP(master)) || rtnl_shlock_nowait()) { + mod_timer(&bond->arp_timer, next_timer); + read_unlock_irqrestore(&bond->lock, flags); + return; + } + + if (rtnl_exlock_nowait()) { + rtnl_shunlock(); + mod_timer(&bond->arp_timer, next_timer); + read_unlock_irqrestore(&bond->lock, flags); + return; + } + + /* see if any of the previous devices are up now (i.e. they have + * xmt and rcv traffic). the current_slave does not come into + * the picture unless it is null. also, slave->jiffies is not needed + * here because we send an arp on each slave and give a slave as + * long as it needs to get the tx/rx within the delta. + * TODO: what about up/down delay in arp mode? it wasn't here before + * so it can wait + */ + slave = (slave_t *)bond; + while ((slave = slave->prev) != (slave_t *)bond) { + + if (slave->link != BOND_LINK_UP) { + + if (((jiffies - slave->dev->trans_start) <= + the_delta_in_ticks) && + ((jiffies - slave->dev->last_rx) <= + the_delta_in_ticks)) { + + slave->link = BOND_LINK_UP; + slave->state = BOND_STATE_ACTIVE; + + /* primary_slave has no meaning in round-robin + * mode. the window of a slave being up and + * current_slave being null after enslaving + * is closed. 
+ */ + read_lock(&bond->ptrlock); + if (bond->current_slave == NULL) { + read_unlock(&bond->ptrlock); + printk(KERN_INFO + "%s: link status definitely up " + "for interface %s, ", + master->name, + slave->dev->name); + change_active_interface(bond); + } else { + read_unlock(&bond->ptrlock); + printk(KERN_INFO + "%s: interface %s is now up\n", + master->name, + slave->dev->name); + } + } + } else { + /* slave->link == BOND_LINK_UP */ + + /* not all switches will respond to an arp request + * when the source ip is 0, so don't take the link down + * if we don't know our ip yet + */ + if (((jiffies - slave->dev->trans_start) >= + (2*the_delta_in_ticks)) || + (((jiffies - slave->dev->last_rx) >= + (2*the_delta_in_ticks)) && my_ip !=0)) { + slave->link = BOND_LINK_DOWN; + slave->state = BOND_STATE_BACKUP; + if (slave->link_failure_count < UINT_MAX) { + slave->link_failure_count++; + } + printk(KERN_INFO + "%s: interface %s is now down.\n", + master->name, + slave->dev->name); + + read_lock(&bond->ptrlock); + if (slave == bond->current_slave) { + read_unlock(&bond->ptrlock); + change_active_interface(bond); + } else { + read_unlock(&bond->ptrlock); + } + } + } + + /* note: if switch is in round-robin mode, all links + * must tx arp to ensure all links rx an arp - otherwise + * links may oscillate or not come up at all; if switch is + * in something like xor mode, there is nothing we can + * do - all replies will be rx'ed on same link causing slaves + * to be unstable during low/no traffic periods + */ + if (IS_UP(slave->dev)) { + arp_send_all(slave); + } + } + + rtnl_exunlock(); + rtnl_shunlock(); + read_unlock_irqrestore(&bond->lock, flags); + + /* re-arm the timer */ + mod_timer(&bond->arp_timer, next_timer); +} + +/* + * When using arp monitoring in active-backup mode, this function is + * called to determine if any backup slaves have went down or a new + * current slave needs to be found. 
+ * The backup slaves never generate traffic, they are considered up by merely + * receiving traffic. If the current slave goes down, each backup slave will + * be given the opportunity to tx/rx an arp before being taken down - this + * prevents all slaves from being taken down due to the current slave not + * sending any traffic for the backups to receive. The arps are not necessarily + * necessary, any tx and rx traffic will keep the current slave up. While any + * rx traffic will keep the backup slaves up, the current slave is responsible + * for generating traffic to keep them up regardless of any other traffic they + * may have received. + * see loadbalance_arp_monitor for arp monitoring in load balancing mode + */ +static void activebackup_arp_monitor(struct net_device *master) +{ + bonding_t *bond; + unsigned long flags; + slave_t *slave; + int the_delta_in_ticks = arp_interval * HZ / 1000; + int next_timer = jiffies + (arp_interval * HZ / 1000); + + bond = (struct bonding *) master->priv; + if (master->priv == NULL) { + mod_timer(&bond->arp_timer, next_timer); + return; + } + + read_lock_irqsave(&bond->lock, flags); + + if (!IS_UP(master)) { + mod_timer(&bond->arp_timer, next_timer); + read_unlock_irqrestore(&bond->lock, flags); + return; + } + + /* determine if any slave has come up or any backup slave has + * gone down + * TODO: what about up/down delay in arp mode? 
it wasn't here before + * so it can wait + */ + slave = (slave_t *)bond; + while ((slave = slave->prev) != (slave_t *)bond) { + + if (slave->link != BOND_LINK_UP) { + if ((jiffies - slave->dev->last_rx) <= + the_delta_in_ticks) { + + slave->link = BOND_LINK_UP; + write_lock(&bond->ptrlock); + if ((bond->current_slave == NULL) && + ((jiffies - slave->dev->trans_start) <= + the_delta_in_ticks)) { + bond->current_slave = slave; + bond_set_slave_active_flags(slave); + bond_mc_update(bond, slave, NULL); + bond->current_arp_slave = NULL; + } else if (bond->current_slave != slave) { + /* this slave has just come up but we + * already have a current slave; this + * can also happen if bond_enslave adds + * a new slave that is up while we are + * searching for a new slave + */ + bond_set_slave_inactive_flags(slave); + bond->current_arp_slave = NULL; + } + + if (slave == bond->current_slave) { + printk(KERN_INFO + "%s: %s is up and now the " + "active interface\n", + master->name, + slave->dev->name); + } else { + printk(KERN_INFO + "%s: backup interface %s is " + "now up\n", + master->name, + slave->dev->name); + } + + write_unlock(&bond->ptrlock); + } + } else { + read_lock(&bond->ptrlock); + if ((slave != bond->current_slave) && + (bond->current_arp_slave == NULL) && + (((jiffies - slave->dev->last_rx) >= + 3*the_delta_in_ticks) && (my_ip != 0))) { + /* a backup slave has gone down; three times + * the delta allows the current slave to be + * taken out before the backup slave. 
+ * note: a non-null current_arp_slave indicates + * the current_slave went down and we are + * searching for a new one; under this + * condition we only take the current_slave + * down - this gives each slave a chance to + * tx/rx traffic before being taken out + */ + read_unlock(&bond->ptrlock); + slave->link = BOND_LINK_DOWN; + if (slave->link_failure_count < UINT_MAX) { + slave->link_failure_count++; + } + bond_set_slave_inactive_flags(slave); + printk(KERN_INFO + "%s: backup interface %s is now down\n", + master->name, + slave->dev->name); + } else { + read_unlock(&bond->ptrlock); + } + } + } + + read_lock(&bond->ptrlock); + slave = bond->current_slave; + read_unlock(&bond->ptrlock); + + if (slave != NULL) { + + /* if we have sent traffic in the past 2*arp_intervals but + * haven't xmit and rx traffic in that time interval, select + * a different slave. slave->jiffies is only updated when + * a slave first becomes the current_slave - not necessarily + * after every arp; this ensures the slave has a full 2*delta + * before being taken out. 
if a primary is being used, check + * if it is up and needs to take over as the current_slave + */ + if ((((jiffies - slave->dev->trans_start) >= + (2*the_delta_in_ticks)) || + (((jiffies - slave->dev->last_rx) >= + (2*the_delta_in_ticks)) && (my_ip != 0))) && + ((jiffies - slave->jiffies) >= 2*the_delta_in_ticks)) { + + slave->link = BOND_LINK_DOWN; + if (slave->link_failure_count < UINT_MAX) { + slave->link_failure_count++; + } + printk(KERN_INFO "%s: link status down for " + "active interface %s, disabling it", + master->name, + slave->dev->name); + slave = change_active_interface(bond); + bond->current_arp_slave = slave; + if (slave != NULL) { + slave->jiffies = jiffies; + } + + } else if ((bond->primary_slave != NULL) && + (bond->primary_slave != slave) && + (bond->primary_slave->link == BOND_LINK_UP)) { + /* at this point, slave is the current_slave */ + printk(KERN_INFO + "%s: changing from interface %s to primary " + "interface %s\n", + master->name, + slave->dev->name, + bond->primary_slave->dev->name); + + /* primary is up so switch to it */ + bond_set_slave_inactive_flags(slave); + bond_mc_update(bond, bond->primary_slave, slave); + write_lock(&bond->ptrlock); + bond->current_slave = bond->primary_slave; + write_unlock(&bond->ptrlock); + slave = bond->primary_slave; + bond_set_slave_active_flags(slave); + slave->jiffies = jiffies; + } else { + bond->current_arp_slave = NULL; + } + + /* the current slave must tx an arp to ensure backup slaves + * rx traffic + */ + if ((slave != NULL) && + (((jiffies - slave->dev->last_rx) >= the_delta_in_ticks) && + (my_ip != 0))) { + arp_send_all(slave); + } + } + + /* if we don't have a current_slave, search for the next available + * backup slave from the current_arp_slave and make it the candidate + * for becoming the current_slave + */ + if (slave == NULL) { + + if ((bond->current_arp_slave == NULL) || + (bond->current_arp_slave == (slave_t *)bond)) { + bond->current_arp_slave = bond->prev; + } + + if 
(bond->current_arp_slave != (slave_t *)bond) { + bond_set_slave_inactive_flags(bond->current_arp_slave); + slave = bond->current_arp_slave->next; + + /* search for next candidate */ + do { + if (IS_UP(slave->dev)) { + slave->link = BOND_LINK_BACK; + bond_set_slave_active_flags(slave); + arp_send_all(slave); + slave->jiffies = jiffies; + bond->current_arp_slave = slave; + break; + } + + /* if the link state is up at this point, we + * mark it down - this can happen if we have + * simultaneous link failures and + * change_active_interface doesn't make this + * one the current slave so it is still marked + * up when it is actually down + */ + if (slave->link == BOND_LINK_UP) { + slave->link = BOND_LINK_DOWN; + if (slave->link_failure_count < + UINT_MAX) { + slave->link_failure_count++; + } + + bond_set_slave_inactive_flags(slave); + printk(KERN_INFO + "%s: backup interface " + "%s is now down.\n", + master->name, + slave->dev->name); + } + } while ((slave = slave->next) != + bond->current_arp_slave->next); + } + } + + mod_timer(&bond->arp_timer, next_timer); + read_unlock_irqrestore(&bond->lock, flags); +} + +typedef uint32_t in_addr_t; + +int +my_inet_aton(char *cp, unsigned long *the_addr) { + static const in_addr_t max[4] = { 0xffffffff, 0xffffff, 0xffff, 0xff }; + in_addr_t val; + char c; + union iaddr { + uint8_t bytes[4]; + uint32_t word; + } res; + uint8_t *pp = res.bytes; + int digit,base; + + res.word = 0; + + c = *cp; + for (;;) { + /* + * Collect number up to ``.''. + * Values are specified as for C: + * 0x=hex, 0=octal, isdigit=decimal. 
+ */ + if (!isdigit(c)) goto ret_0; + val = 0; base = 10; digit = 0; + for (;;) { + if (isdigit(c)) { + val = (val * base) + (c - '0'); + c = *++cp; + digit = 1; + } else { + break; + } + } + if (c == '.') { + /* + * Internet format: + * a.b.c.d + * a.b.c (with c treated as 16 bits) + * a.b (with b treated as 24 bits) + */ + if (pp > res.bytes + 2 || val > 0xff) { + goto ret_0; + } + *pp++ = val; + c = *++cp; + } else + break; + } + /* + * Check for trailing characters. + */ + if (c != '\0' && (!isascii(c) || !isspace(c))) { + goto ret_0; + } + /* + * Did we get a valid digit? + */ + if (!digit) { + goto ret_0; + } + + /* Check whether the last part is in its limits depending on + the number of parts in total. */ + if (val > max[pp - res.bytes]) { + goto ret_0; + } + + if (the_addr != NULL) { + *the_addr = res.word | htonl (val); + } + + return (1); + +ret_0: + return (0); +} + +static int bond_sethwaddr(struct net_device *master, struct net_device *slave) +{ +#ifdef BONDING_DEBUG + printk(KERN_CRIT "bond_sethwaddr: master=%x\n", (unsigned int)master); + printk(KERN_CRIT "bond_sethwaddr: slave=%x\n", (unsigned int)slave); + printk(KERN_CRIT "bond_sethwaddr: slave->addr_len=%d\n", slave->addr_len); +#endif + memcpy(master->dev_addr, slave->dev_addr, slave->addr_len); + return 0; +} + +static int bond_info_query(struct net_device *master, struct ifbond *info) +{ + bonding_t *bond = (struct bonding *) master->priv; + slave_t *slave; + unsigned long flags; + + info->bond_mode = bond_mode; + info->num_slaves = 0; + info->miimon = miimon; + + read_lock_irqsave(&bond->lock, flags); + for (slave = bond->prev; slave != (slave_t *)bond; slave = slave->prev) { + info->num_slaves++; + } + read_unlock_irqrestore(&bond->lock, flags); + + return 0; +} + +static int bond_slave_info_query(struct net_device *master, + struct ifslave *info) +{ + bonding_t *bond = (struct bonding *) master->priv; + slave_t *slave; + int cur_ndx = 0; + unsigned long flags; + + if (info->slave_id < 0) { 
+ return -ENODEV; + } + + read_lock_irqsave(&bond->lock, flags); + for (slave = bond->prev; + slave != (slave_t *)bond && cur_ndx < info->slave_id; + slave = slave->prev) { + cur_ndx++; + } + read_unlock_irqrestore(&bond->lock, flags); + + if (slave != (slave_t *)bond) { + strcpy(info->slave_name, slave->dev->name); + info->link = slave->link; + info->state = slave->state; + info->link_failure_count = slave->link_failure_count; + } else { + return -ENODEV; + } + + return 0; +} + +static int bond_ioctl(struct net_device *master_dev, struct ifreq *ifr, int cmd) +{ + struct net_device *slave_dev = NULL; + struct ifbond *u_binfo = NULL, k_binfo; + struct ifslave *u_sinfo = NULL, k_sinfo; + struct mii_ioctl_data *mii = NULL; + int ret = 0; + +#ifdef BONDING_DEBUG + printk(KERN_INFO "bond_ioctl: master=%s, cmd=%d\n", + master_dev->name, cmd); +#endif + + switch (cmd) { + case SIOCGMIIPHY: + mii = (struct mii_ioctl_data *)&ifr->ifr_data; + if (mii == NULL) { + return -EINVAL; + } + mii->phy_id = 0; + /* Fall Through */ + case SIOCGMIIREG: + /* + * We do this again just in case we were called by SIOCGMIIREG + * instead of SIOCGMIIPHY. 
+ */ + mii = (struct mii_ioctl_data *)&ifr->ifr_data; + if (mii == NULL) { + return -EINVAL; + } + if (mii->reg_num == 1) { + mii->val_out = bond_check_mii_link( + (struct bonding *)master_dev->priv); + } + return 0; + case BOND_INFO_QUERY_OLD: + case SIOCBONDINFOQUERY: + u_binfo = (struct ifbond *)ifr->ifr_data; + if (copy_from_user(&k_binfo, u_binfo, sizeof(ifbond))) { + return -EFAULT; + } + ret = bond_info_query(master_dev, &k_binfo); + if (ret == 0) { + if (copy_to_user(u_binfo, &k_binfo, sizeof(ifbond))) { + return -EFAULT; + } + } + return ret; + case BOND_SLAVE_INFO_QUERY_OLD: + case SIOCBONDSLAVEINFOQUERY: + u_sinfo = (struct ifslave *)ifr->ifr_data; + if (copy_from_user(&k_sinfo, u_sinfo, sizeof(ifslave))) { + return -EFAULT; + } + ret = bond_slave_info_query(master_dev, &k_sinfo); + if (ret == 0) { + if (copy_to_user(u_sinfo, &k_sinfo, sizeof(ifslave))) { + return -EFAULT; + } + } + return ret; + } + + if (!capable(CAP_NET_ADMIN)) { + return -EPERM; + } + + slave_dev = dev_get_by_name(ifr->ifr_slave); + +#ifdef BONDING_DEBUG + printk(KERN_INFO "slave_dev=%x: \n", (unsigned int)slave_dev); + printk(KERN_INFO "slave_dev->name=%s: \n", slave_dev->name); +#endif + + if (slave_dev == NULL) { + ret = -ENODEV; + } else { + switch (cmd) { + case BOND_ENSLAVE_OLD: + case SIOCBONDENSLAVE: + ret = bond_enslave(master_dev, slave_dev); + break; + case BOND_RELEASE_OLD: + case SIOCBONDRELEASE: + ret = bond_release(master_dev, slave_dev); + break; + case BOND_SETHWADDR_OLD: + case SIOCBONDSETHWADDR: + ret = bond_sethwaddr(master_dev, slave_dev); + break; + case BOND_CHANGE_ACTIVE_OLD: + case SIOCBONDCHANGEACTIVE: + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + ret = bond_change_active(master_dev, slave_dev); + } + else { + ret = -EINVAL; + } + break; + default: + ret = -EOPNOTSUPP; + } + dev_put(slave_dev); + } + return ret; +} + +#ifdef CONFIG_NET_FASTROUTE +static int bond_accept_fastpath(struct net_device *dev, struct dst_entry *dst) +{ + return -1; +} +#endif + +/* 
+ * in broadcast mode, we send everything to all usable interfaces. + */ +static int bond_xmit_broadcast(struct sk_buff *skb, struct net_device *dev) +{ + slave_t *slave, *start_at; + struct bonding *bond = (struct bonding *) dev->priv; + unsigned long flags; + struct net_device *device_we_should_send_to = 0; + + if (!IS_UP(dev)) { /* bond down */ + dev_kfree_skb(skb); + return 0; + } + + read_lock_irqsave(&bond->lock, flags); + + read_lock(&bond->ptrlock); + slave = start_at = bond->current_slave; + read_unlock(&bond->ptrlock); + + if (slave == NULL) { /* we're at the root, get the first slave */ + /* no suitable interface, frame not sent */ + read_unlock_irqrestore(&bond->lock, flags); + dev_kfree_skb(skb); + return 0; + } + + do { + if (IS_UP(slave->dev) + && (slave->link == BOND_LINK_UP) + && (slave->state == BOND_STATE_ACTIVE)) { + if (device_we_should_send_to) { + struct sk_buff *skb2; + if ((skb2 = skb_clone(skb, GFP_ATOMIC)) == NULL) { + printk(KERN_ERR "bond_xmit_broadcast: skb_clone() failed\n"); + continue; + } + + skb2->dev = device_we_should_send_to; + skb2->priority = 1; + dev_queue_xmit(skb2); + } + device_we_should_send_to = slave->dev; + } + } while ((slave = slave->next) != start_at); + + if (device_we_should_send_to) { + skb->dev = device_we_should_send_to; + skb->priority = 1; + dev_queue_xmit(skb); + } else + dev_kfree_skb(skb); + + /* frame sent to all suitable interfaces */ + read_unlock_irqrestore(&bond->lock, flags); + return 0; +} + +static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *dev) +{ + slave_t *slave, *start_at; + struct bonding *bond = (struct bonding *) dev->priv; + unsigned long flags; + + if (!IS_UP(dev)) { /* bond down */ + dev_kfree_skb(skb); + return 0; + } + + read_lock_irqsave(&bond->lock, flags); + + read_lock(&bond->ptrlock); + slave = start_at = bond->current_slave; + read_unlock(&bond->ptrlock); + + if (slave == NULL) { /* we're at the root, get the first slave */ + /* no suitable interface, frame 
not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + do { + if (IS_UP(slave->dev) + && (slave->link == BOND_LINK_UP) + && (slave->state == BOND_STATE_ACTIVE)) { + + skb->dev = slave->dev; + skb->priority = 1; + dev_queue_xmit(skb); + + write_lock(&bond->ptrlock); + bond->current_slave = slave->next; + write_unlock(&bond->ptrlock); + + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + } while ((slave = slave->next) != start_at); + + /* no suitable interface, frame not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; +} + +/* + * in XOR mode, we determine the output device by performing xor on + * the source and destination hw addresses. If this device is not + * enabled, find the next slave following this xor slave. + */ +static int bond_xmit_xor(struct sk_buff *skb, struct net_device *dev) +{ + slave_t *slave, *start_at; + struct bonding *bond = (struct bonding *) dev->priv; + unsigned long flags; + struct ethhdr *data = (struct ethhdr *)skb->data; + int slave_no; + + if (!IS_UP(dev)) { /* bond down */ + dev_kfree_skb(skb); + return 0; + } + + read_lock_irqsave(&bond->lock, flags); + slave = bond->prev; + + /* we're at the root, get the first slave */ + if (bond->slave_cnt == 0) { + /* no suitable interface, frame not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + slave_no = (data->h_dest[5]^slave->dev->dev_addr[5]) % bond->slave_cnt; + + while ( (slave_no > 0) && (slave != (slave_t *)bond) ) { + slave = slave->prev; + slave_no--; + } + start_at = slave; + + do { + if (IS_UP(slave->dev) + && (slave->link == BOND_LINK_UP) + && (slave->state == BOND_STATE_ACTIVE)) { + + skb->dev = slave->dev; + skb->priority = 1; + dev_queue_xmit(skb); + + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + } while ((slave = slave->next) != start_at); + + /* no suitable interface, frame not sent */ + dev_kfree_skb(skb); +
read_unlock_irqrestore(&bond->lock, flags); + return 0; +} + +/* + * in active-backup mode, we know that bond->current_slave is always valid if + * the bond has a usable interface. + */ +static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *dev) +{ + struct bonding *bond = (struct bonding *) dev->priv; + unsigned long flags; + int ret; + + if (!IS_UP(dev)) { /* bond down */ + dev_kfree_skb(skb); + return 0; + } + + /* if we are sending arp packets, try to at least + identify our own ip address */ + if ( (arp_interval > 0) && (my_ip == 0) && + (skb->protocol == __constant_htons(ETH_P_ARP) ) ) { + char *the_ip = (((char *)skb->data)) + + sizeof(struct ethhdr) + + sizeof(struct arphdr) + + ETH_ALEN; + memcpy(&my_ip, the_ip, 4); + } + + /* if we are sending arp packets and don't know + * the target hw address, save it so we don't need + * to use a broadcast address. + * don't do this if in active backup mode because the slaves must + * receive packets to stay up, and the only ones they receive are + * broadcasts. 
+ */ + if ( (bond_mode != BOND_MODE_ACTIVEBACKUP) && + (arp_ip_count == 1) && + (arp_interval > 0) && (arp_target_hw_addr == NULL) && + (skb->protocol == __constant_htons(ETH_P_IP) ) ) { + struct ethhdr *eth_hdr = + (struct ethhdr *) (((char *)skb->data)); + struct iphdr *ip_hdr = (struct iphdr *)(eth_hdr + 1); + + if (arp_target[0] == ip_hdr->daddr) { + arp_target_hw_addr = kmalloc(ETH_ALEN, GFP_KERNEL); + if (arp_target_hw_addr != NULL) + memcpy(arp_target_hw_addr, eth_hdr->h_dest, ETH_ALEN); + } + } + + read_lock_irqsave(&bond->lock, flags); + + read_lock(&bond->ptrlock); + if (bond->current_slave != NULL) { /* one usable interface */ + skb->dev = bond->current_slave->dev; + read_unlock(&bond->ptrlock); + skb->priority = 1; + ret = dev_queue_xmit(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + else { + read_unlock(&bond->ptrlock); + } + + /* no suitable interface, frame not sent */ +#ifdef BONDING_DEBUG + printk(KERN_INFO "There was no suitable interface, so we don't transmit\n"); +#endif + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; +} + +static struct net_device_stats *bond_get_stats(struct net_device *dev) +{ + bonding_t *bond = dev->priv; + struct net_device_stats *stats = bond->stats, *sstats; + slave_t *slave; + unsigned long flags; + + memset(bond->stats, 0, sizeof(struct net_device_stats)); + + read_lock_irqsave(&bond->lock, flags); + + for (slave = bond->prev; slave != (slave_t *)bond; slave = slave->prev) { + sstats = slave->dev->get_stats(slave->dev); + + stats->rx_packets += sstats->rx_packets; + stats->rx_bytes += sstats->rx_bytes; + stats->rx_errors += sstats->rx_errors; + stats->rx_dropped += sstats->rx_dropped; + + stats->tx_packets += sstats->tx_packets; + stats->tx_bytes += sstats->tx_bytes; + stats->tx_errors += sstats->tx_errors; + stats->tx_dropped += sstats->tx_dropped; + + stats->multicast += sstats->multicast; + stats->collisions += sstats->collisions; + + stats->rx_length_errors += 
sstats->rx_length_errors; + stats->rx_over_errors += sstats->rx_over_errors; + stats->rx_crc_errors += sstats->rx_crc_errors; + stats->rx_frame_errors += sstats->rx_frame_errors; + stats->rx_fifo_errors += sstats->rx_fifo_errors; + stats->rx_missed_errors += sstats->rx_missed_errors; + + stats->tx_aborted_errors += sstats->tx_aborted_errors; + stats->tx_carrier_errors += sstats->tx_carrier_errors; + stats->tx_fifo_errors += sstats->tx_fifo_errors; + stats->tx_heartbeat_errors += sstats->tx_heartbeat_errors; + stats->tx_window_errors += sstats->tx_window_errors; + + } + + read_unlock_irqrestore(&bond->lock, flags); + return stats; +} + +static int bond_get_info(char *buf, char **start, off_t offset, int length) +{ + bonding_t *bond = these_bonds; + int len = 0; + off_t begin = 0; + u16 link; + slave_t *slave = NULL; + unsigned long flags; + + while (bond != NULL) { + /* + * This function locks the mutex, so we can't lock it until + * afterwards + */ + link = bond_check_mii_link(bond); + + len += sprintf(buf + len, "Bonding Mode: %s\n", + bond_mode_name()); + + if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + read_lock_irqsave(&bond->lock, flags); + read_lock(&bond->ptrlock); + if (bond->current_slave != NULL) { + len += sprintf(buf + len, + "Currently Active Slave: %s\n", + bond->current_slave->dev->name); + } + read_unlock(&bond->ptrlock); + read_unlock_irqrestore(&bond->lock, flags); + } + + len += sprintf(buf + len, "MII Status: "); + len += sprintf(buf + len, + link == BMSR_LSTATUS ? 
"up\n" : "down\n"); + len += sprintf(buf + len, "MII Polling Interval (ms): %d\n", + miimon); + len += sprintf(buf + len, "Up Delay (ms): %d\n", + updelay * miimon); + len += sprintf(buf + len, "Down Delay (ms): %d\n", + downdelay * miimon); + len += sprintf(buf + len, "Multicast Mode: %s\n", + multicast_mode_name()); + + read_lock_irqsave(&bond->lock, flags); + for (slave = bond->prev; slave != (slave_t *)bond; + slave = slave->prev) { + len += sprintf(buf + len, "\nSlave Interface: %s\n", slave->dev->name); + + len += sprintf(buf + len, "MII Status: "); + + len += sprintf(buf + len, + slave->link == BOND_LINK_UP ? + "up\n" : "down\n"); + len += sprintf(buf + len, "Link Failure Count: %d\n", + slave->link_failure_count); + + len += sprintf(buf + len, + "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", + slave->perm_hwaddr[0], + slave->perm_hwaddr[1], + slave->perm_hwaddr[2], + slave->perm_hwaddr[3], + slave->perm_hwaddr[4], + slave->perm_hwaddr[5]); + } + read_unlock_irqrestore(&bond->lock, flags); + + /* + * Figure out the calcs for the /proc/net interface + */ + *start = buf + (offset - begin); + len -= (offset - begin); + if (len > length) { + len = length; + } + if (len < 0) { + len = 0; + } + + + bond = bond->next_bond; + } + return len; +} + +static int bond_event(struct notifier_block *this, unsigned long event, + void *ptr) +{ + struct bonding *this_bond = (struct bonding *)these_bonds; + struct bonding *last_bond; + struct net_device *event_dev = (struct net_device *)ptr; + + /* while there are bonds configured */ + while (this_bond != NULL) { + if (this_bond == event_dev->priv ) { + switch (event) { + case NETDEV_UNREGISTER: + /* + * remove this bond from a linked list of + * bonds + */ + if (this_bond == these_bonds) { + these_bonds = this_bond->next_bond; + } else { + for (last_bond = these_bonds; + last_bond != NULL; + last_bond = last_bond->next_bond) { + if (last_bond->next_bond == + this_bond) { + last_bond->next_bond = + this_bond->next_bond; 
+ } + } + } + return NOTIFY_DONE; + + default: + return NOTIFY_DONE; + } + } else if (this_bond->device == event_dev->master) { + switch (event) { + case NETDEV_UNREGISTER: + bond_release(this_bond->device, event_dev); + break; + } + return NOTIFY_DONE; + } + this_bond = this_bond->next_bond; + } + return NOTIFY_DONE; +} + +static struct notifier_block bond_netdev_notifier = { + notifier_call: bond_event, +}; + +static int __init bond_init(struct net_device *dev) +{ + bonding_t *bond, *this_bond, *last_bond; + int count; + +#ifdef BONDING_DEBUG + printk (KERN_INFO "Begin bond_init for %s\n", dev->name); +#endif + bond = kmalloc(sizeof(struct bonding), GFP_KERNEL); + if (bond == NULL) { + return -ENOMEM; + } + memset(bond, 0, sizeof(struct bonding)); + + /* initialize rwlocks */ + rwlock_init(&bond->lock); + rwlock_init(&bond->ptrlock); + + bond->stats = kmalloc(sizeof(struct net_device_stats), GFP_KERNEL); + if (bond->stats == NULL) { + kfree(bond); + return -ENOMEM; + } + memset(bond->stats, 0, sizeof(struct net_device_stats)); + + bond->next = bond->prev = (slave_t *)bond; + bond->current_slave = NULL; + bond->current_arp_slave = NULL; + bond->device = dev; + dev->priv = bond; + + /* Initialize the device structure. */ + switch (bond_mode) { + case BOND_MODE_ACTIVEBACKUP: + dev->hard_start_xmit = bond_xmit_activebackup; + break; + case BOND_MODE_ROUNDROBIN: + dev->hard_start_xmit = bond_xmit_roundrobin; + break; + case BOND_MODE_XOR: + dev->hard_start_xmit = bond_xmit_xor; + break; + case BOND_MODE_BROADCAST: + dev->hard_start_xmit = bond_xmit_broadcast; + break; + default: + printk(KERN_ERR "Unknown bonding mode %d\n", bond_mode); + kfree(bond->stats); + kfree(bond); + return -EINVAL; + } + + dev->get_stats = bond_get_stats; + dev->open = bond_open; + dev->stop = bond_close; + dev->set_multicast_list = set_multicast_list; + dev->do_ioctl = bond_ioctl; + + /* + * Fill in the fields of the device structure with ethernet-generic + * values. 
+ */ + + ether_setup(dev); + + dev->tx_queue_len = 0; + dev->flags |= IFF_MASTER|IFF_MULTICAST; +#ifdef CONFIG_NET_FASTROUTE + dev->accept_fastpath = bond_accept_fastpath; +#endif + + printk(KERN_INFO "%s registered with", dev->name); + if (miimon > 0) { + printk(" MII link monitoring set to %d ms", miimon); + updelay /= miimon; + downdelay /= miimon; + } else { + printk("out MII link monitoring"); + } + printk(", in %s mode.\n", bond_mode_name()); + + printk(KERN_INFO "%s registered with", dev->name); + if (arp_interval > 0) { + printk(" ARP monitoring set to %d ms with %d target(s):", + arp_interval, arp_ip_count); + for (count=0 ; count<arp_ip_count ; count++) + printk (" %s", arp_ip_target[count]); + printk("\n"); + } else { + printk("out ARP monitoring\n"); + } + +#ifdef CONFIG_PROC_FS + bond->bond_proc_dir = proc_mkdir(dev->name, proc_net); + if (bond->bond_proc_dir == NULL) { + printk(KERN_ERR "%s: Cannot init /proc/net/%s/\n", + dev->name, dev->name); + kfree(bond->stats); + kfree(bond); + return -ENOMEM; + } + bond->bond_proc_info_file = + create_proc_info_entry("info", 0, bond->bond_proc_dir, + bond_get_info); + if (bond->bond_proc_info_file == NULL) { + printk(KERN_ERR "%s: Cannot init /proc/net/%s/info\n", + dev->name, dev->name); + remove_proc_entry(dev->name, proc_net); + kfree(bond->stats); + kfree(bond); + return -ENOMEM; + } +#endif /* CONFIG_PROC_FS */ + + if (first_pass == 1) { + these_bonds = bond; + register_netdevice_notifier(&bond_netdev_notifier); + first_pass = 0; + } else { + last_bond = these_bonds; + this_bond = these_bonds->next_bond; + while (this_bond != NULL) { + last_bond = this_bond; + this_bond = this_bond->next_bond; + } + last_bond->next_bond = bond; + } + + return 0; +} + +/* +static int __init bond_probe(struct net_device *dev) +{ + bond_init(dev); + return 0; +} + */ + +/* + * Convert string input module parms. Accept either the + * number of the mode or its string name.
+ */ +static inline int +bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl) +{ + int i; + + for (i = 0; tbl[i].modename != NULL; i++) { + if ((isdigit(*mode_arg) && + tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) || + (0 == strncmp(mode_arg, tbl[i].modename, + strlen(tbl[i].modename)))) { + return tbl[i].mode; + } + } + + return -1; +} + + +static int __init bonding_init(void) +{ + int no; + int err; + + /* Find a name for this unit */ + static struct net_device *dev_bond = NULL; + + printk(KERN_INFO "%s", version); + + /* + * Convert string parameters. + */ + if (mode) { + bond_mode = bond_parse_parm(mode, bond_mode_tbl); + if (bond_mode == -1) { + printk(KERN_WARNING + "bonding_init(): Invalid bonding mode \"%s\"\n", + mode == NULL ? "NULL" : mode); + return -EINVAL; + } + } + + if (multicast) { + multicast_mode = bond_parse_parm(multicast, bond_mc_tbl); + if (multicast_mode == -1) { + printk(KERN_WARNING + "bonding_init(): Invalid multicast mode \"%s\"\n", + multicast == NULL ? 
"NULL" : multicast); + return -EINVAL; + } + } + + if (max_bonds < 1 || max_bonds > INT_MAX) { + printk(KERN_WARNING + "bonding_init(): max_bonds (%d) not in range %d-%d, " + "so it was reset to BOND_DEFAULT_MAX_BONDS (%d)", + max_bonds, 1, INT_MAX, BOND_DEFAULT_MAX_BONDS); + max_bonds = BOND_DEFAULT_MAX_BONDS; + } + dev_bond = dev_bonds = kmalloc(max_bonds*sizeof(struct net_device), + GFP_KERNEL); + if (dev_bond == NULL) { + return -ENOMEM; + } + memset(dev_bonds, 0, max_bonds*sizeof(struct net_device)); + + if (miimon < 0) { + printk(KERN_WARNING + "bonding_init(): miimon module parameter (%d), " + "not in range 0-%d, so it was reset to %d\n", + miimon, INT_MAX, BOND_LINK_MON_INTERV); + miimon = BOND_LINK_MON_INTERV; + } + + if (updelay < 0) { + printk(KERN_WARNING + "bonding_init(): updelay module parameter (%d), " + "not in range 0-%d, so it was reset to 0\n", + updelay, INT_MAX); + updelay = 0; + } + + if (downdelay < 0) { + printk(KERN_WARNING + "bonding_init(): downdelay module parameter (%d), " + "not in range 0-%d, so it was reset to 0\n", + downdelay, INT_MAX); + downdelay = 0; + } + + if (miimon == 0) { + if ((updelay != 0) || (downdelay != 0)) { + /* just warn the user the up/down delay will have + * no effect since miimon is zero... 
+ */ + printk(KERN_WARNING + "bonding_init(): miimon module parameter not " + "set and updelay (%d) or downdelay (%d) module " + "parameter is set; updelay and downdelay have " + "no effect unless miimon is set\n", + updelay, downdelay); + } + } else { + /* don't allow arp monitoring */ + if (arp_interval != 0) { + printk(KERN_WARNING + "bonding_init(): miimon (%d) and arp_interval " + "(%d) can't be used simultaneously, " + "disabling ARP monitoring\n", + miimon, arp_interval); + arp_interval = 0; + } + + if ((updelay % miimon) != 0) { + /* updelay will be rounded in bond_init() when it + * is divided by miimon, we just inform user here + */ + printk(KERN_WARNING + "bonding_init(): updelay (%d) is not a multiple " + "of miimon (%d), updelay rounded to %d ms\n", + updelay, miimon, (updelay / miimon) * miimon); + } + + if ((downdelay % miimon) != 0) { + /* downdelay will be rounded in bond_init() when it + * is divided by miimon, we just inform user here + */ + printk(KERN_WARNING + "bonding_init(): downdelay (%d) is not a " + "multiple of miimon (%d), downdelay rounded " + "to %d ms\n", + downdelay, miimon, + (downdelay / miimon) * miimon); + } + } + + if (arp_interval < 0) { + printk(KERN_WARNING + "bonding_init(): arp_interval module parameter (%d), " + "not in range 0-%d, so it was reset to %d\n", + arp_interval, INT_MAX, BOND_LINK_ARP_INTERV); + arp_interval = BOND_LINK_ARP_INTERV; + } + + for (arp_ip_count=0 ; + (arp_ip_count < MAX_ARP_IP_TARGETS) && arp_ip_target[arp_ip_count]; + arp_ip_count++ ) { + /* TODO: check and log bad ip address */ + if (my_inet_aton(arp_ip_target[arp_ip_count], + &arp_target[arp_ip_count]) == 0) { + printk(KERN_WARNING + "bonding_init(): bad arp_ip_target module " + "parameter (%s), ARP monitoring will not be " + "performed\n", + arp_ip_target[arp_ip_count]); + arp_interval = 0; + } + } + + + if ( (arp_interval > 0) && (arp_ip_count==0)) { + /* don't allow arping if no arp_ip_target given... 
*/ + printk(KERN_WARNING + "bonding_init(): arp_interval module parameter " + "(%d) specified without providing an arp_ip_target " + "parameter, arp_interval was reset to 0\n", + arp_interval); + arp_interval = 0; + } + + if ((miimon == 0) && (arp_interval == 0)) { + /* miimon and arp_interval not set, we need one so things + * work as expected, see bonding.txt for details + */ + printk(KERN_ERR + "bonding_init(): either miimon or " + "arp_interval and arp_ip_target module parameters " + "must be specified, otherwise bonding will not detect " + "link failures! see bonding.txt for details.\n"); + } + + if ((primary != NULL) && (bond_mode != BOND_MODE_ACTIVEBACKUP)){ + /* currently, using a primary only makes sense + * in active backup mode + */ + printk(KERN_WARNING + "bonding_init(): %s primary device specified but has " + "no effect in %s mode\n", + primary, bond_mode_name()); + primary = NULL; + } + + + for (no = 0; no < max_bonds; no++) { + dev_bond->init = bond_init; + + err = dev_alloc_name(dev_bond,"bond%d"); + if (err < 0) { + kfree(dev_bonds); + return err; + } + SET_MODULE_OWNER(dev_bond); + if (register_netdev(dev_bond) != 0) { + kfree(dev_bonds); + return -EIO; + } + dev_bond++; + } + return 0; +} + +static void __exit bonding_exit(void) +{ + struct net_device *dev_bond = dev_bonds; + struct bonding *bond; + int no; + + unregister_netdevice_notifier(&bond_netdev_notifier); + + for (no = 0; no < max_bonds; no++) { + +#ifdef CONFIG_PROC_FS + bond = (struct bonding *) dev_bond->priv; + remove_proc_entry("info", bond->bond_proc_dir); + remove_proc_entry(dev_bond->name, proc_net); +#endif + unregister_netdev(dev_bond); + kfree(bond->stats); + kfree(dev_bond->priv); + + dev_bond->priv = NULL; + dev_bond++; + } + kfree(dev_bonds); +} + +module_init(bonding_init); +module_exit(bonding_exit); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION(DRV_DESCRIPTION ", v" DRV_VERSION); + +/* + * Local variables: + * c-indent-level: 8 + * c-basic-offset: 8 + * tab-width: 8 + *
End: + */ diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/Makefile linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/Makefile --- linux-2.4.20-bonding-20030317/drivers/net/bonding/Makefile 1970-01-01 02:00:00.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/Makefile 2003-03-18 17:03:29.000000000 +0200 @@ -0,0 +1,12 @@ +# +# Makefile for the Ethernet Bonding driver +# + +O_TARGET := bonding.o + +obj-y := bond_main.o + +obj-m := $(O_TARGET) + +include $(TOPDIR)/Rules.make + diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding.c 2003-03-18 17:03:29.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding.c 1970-01-01 02:00:00.000000000 +0200 @@ -1,3434 +0,0 @@ -/* - * originally based on the dummy device. - * - * Copyright 1999, Thomas Davis, tadavis@lbl.gov. - * Licensed under the GPL. Based on dummy.c, and eql.c devices. - * - * bonding.c: an Ethernet Bonding driver - * - * This is useful to talk to a Cisco EtherChannel compatible equipment: - * Cisco 5500 - * Sun Trunking (Solaris) - * Alteon AceDirector Trunks - * Linux Bonding - * and probably many L2 switches ... - * - * How it works: - * ifconfig bond0 ipaddress netmask up - * will setup a network device, with an ip address. No mac address - * will be assigned at this time. The hw mac address will come from - * the first slave bonded to the channel. All slaves will then use - * this hw mac address. - * - * ifconfig bond0 down - * will release all slaves, marking them as down. - * - * ifenslave bond0 eth0 - * will attach eth0 to bond0 as a slave. eth0 hw mac address will either - * a: be used as initial mac address - * b: if a hw mac address already is there, eth0's hw mac address - * will then be set from bond0. - * - * v0.1 - first working version. - * v0.2 - changed stats to be calculated by summing slaves stats. 
- * - * Changes: - * Arnaldo Carvalho de Melo - * - fix leaks on failure at bond_init - * - * 2000/09/30 - Willy Tarreau - * - added trivial code to release a slave device. - * - fixed security bug (CAP_NET_ADMIN not checked) - * - implemented MII link monitoring to disable dead links : - * All MII capable slaves are checked every milliseconds - * (100 ms seems good). This value can be changed by passing it to - * insmod. A value of zero disables the monitoring (default). - * - fixed an infinite loop in bond_xmit_roundrobin() when there's no - * good slave. - * - made the code hopefully SMP safe - * - * 2000/10/03 - Willy Tarreau - * - optimized slave lists based on relevant suggestions from Thomas Davis - * - implemented active-backup method to obtain HA with two switches: - * stay as long as possible on the same active interface, while we - * also monitor the backup one (MII link status) because we want to know - * if we are able to switch at any time. ( pass "mode=1" to insmod ) - * - lots of stress testings because we need it to be more robust than the - * wires ! :-> - * - * 2000/10/09 - Willy Tarreau - * - added up and down delays after link state change. - * - optimized the slaves chaining so that when we run forward, we never - * repass through the bond itself, but we can find it by searching - * backwards. Renders the deletion more difficult, but accelerates the - * scan. - * - smarter enslaving and releasing. - * - finer and more robust SMP locking - * - * 2000/10/17 - Willy Tarreau - * - fixed two potential SMP race conditions - * - * 2000/10/18 - Willy Tarreau - * - small fixes to the monitoring FSM in case of zero delays - * 2000/11/01 - Willy Tarreau - * - fixed first slave not automatically used in trunk mode. - * 2000/11/10 : spelling of "EtherChannel" corrected. - * 2000/11/13 : fixed a race condition in case of concurrent accesses to ioctl(). - * 2000/12/16 : fixed improper usage of rtnl_exlock_nowait(). - * - * 2001/1/3 - Chad N. 
Tindel - * - The bonding driver now simulates MII status monitoring, just like - * a normal network device. It will show that the link is down iff - * every slave in the bond shows that their links are down. If at least - * one slave is up, the bond's MII status will appear as up. - * - * 2001/2/7 - Chad N. Tindel - * - Applications can now query the bond from user space to get - * information which may be useful. They do this by calling - * the BOND_INFO_QUERY ioctl. Once the app knows how many slaves - * are in the bond, it can call the BOND_SLAVE_INFO_QUERY ioctl to - * get slave specific information (# link failures, etc). See - * for more details. The structs of interest - * are ifbond and ifslave. - * - * 2001/4/5 - Chad N. Tindel - * - Ported to 2.4 Kernel - * - * 2001/5/2 - Jeffrey E. Mast - * - When a device is detached from a bond, the slave device is no longer - * left thinking that is has a master. - * - * 2001/5/16 - Jeffrey E. Mast - * - memset did not appropriately initialized the bond rw_locks. Used - * rwlock_init to initialize to unlocked state to prevent deadlock when - * first attempting a lock - * - Called SET_MODULE_OWNER for bond device - * - * 2001/5/17 - Tim Anderson - * - 2 paths for releasing for slave release; 1 through ioctl - * and 2) through close. Both paths need to release the same way. - * - the free slave in bond release is changing slave status before - * the free. The netdev_set_master() is intended to change slave state - * so it should not be done as part of the release process. - * - Simple rule for slave state at release: only the active in A/B and - * only one in the trunked case. - * - * 2001/6/01 - Tim Anderson - * - Now call dev_close when releasing a slave so it doesn't screw up - * out routing table. - * - * 2001/6/01 - Chad N. Tindel - * - Added /proc support for getting bond and slave information. - * Information is in /proc/net//info. - * - Changed the locking when calling bond_close to prevent deadlock. 
- * - * 2001/8/05 - Janice Girouard - * - correct problem where refcnt of slave is not incremented in bond_ioctl - * so the system hangs when halting. - * - correct locking problem when unable to malloc in bond_enslave. - * - adding bond_xmit_xor logic. - * - adding multiple bond device support. - * - * 2001/8/13 - Erik Habbinga - * - correct locking problem with rtnl_exlock_nowait - * - * 2001/8/23 - Janice Girouard - * - bzero initial dev_bonds, to correct oops - * - convert SIOCDEVPRIVATE to new MII ioctl calls - * - * 2001/9/13 - Takao Indoh - * - Add the BOND_CHANGE_ACTIVE ioctl implementation - * - * 2001/9/14 - Mark Huth - * - Change MII_LINK_READY to not check for end of auto-negotiation, - * but only for an up link. - * - * 2001/9/20 - Chad N. Tindel - * - Add the device field to bonding_t. Previously the net_device - * corresponding to a bond wasn't available from the bonding_t - * structure. - * - * 2001/9/25 - Janice Girouard - * - add arp_monitor for active backup mode - * - * 2001/10/23 - Takao Indoh - * - Various memory leak fixes - * - * 2001/11/5 - Mark Huth - * - Don't take rtnl lock in bond_mii_monitor as it deadlocks under - * certain hotswap conditions. - * Note: this same change may be required in bond_arp_monitor ??? - * - Remove possibility of calling bond_sethwaddr with NULL slave_dev ptr - * - Handle hot swap ethernet interface deregistration events to remove - * kernel oops following hot swap of enslaved interface - * - * 2002/1/2 - Chad N. Tindel - * - Restore original slave flags at release time. 
- * - * 2002/02/18 - Erik Habbinga - * - bond_release(): calling kfree on our_slave after call to - * bond_restore_slave_flags, not before - * - bond_enslave(): saving slave flags into original_flags before - * call to netdev_set_master, so the IFF_SLAVE flag doesn't end - * up in original_flags - * - * 2002/04/05 - Mark Smith and - * Steve Mead - * - Port Gleb Natapov's multicast support patchs from 2.4.12 - * to 2.4.18 adding support for multicast. - * - * 2002/06/10 - Tony Cureington - * - corrected uninitialized pointer (ifr.ifr_data) in bond_check_dev_link; - * actually changed function to use MIIPHY, then MIIREG, and finally - * ETHTOOL to determine the link status - * - fixed bad ifr_data pointer assignments in bond_ioctl - * - corrected mode 1 being reported as active-backup in bond_get_info; - * also added text to distinguish type of load balancing (rr or xor) - * - change arp_ip_target module param from "1-12s" (array of 12 ptrs) - * to "s" (a single ptr) - * - * 2002/08/30 - Jay Vosburgh - * - Removed acquisition of xmit_lock in set_multicast_list; caused - * deadlock on SMP (lock is held by caller). - * - Revamped SIOCGMIIPHY, SIOCGMIIREG portion of bond_check_dev_link(). - * - * 2002/09/18 - Jay Vosburgh - * - Fixed up bond_check_dev_link() (and callers): removed some magic - * numbers, banished local MII_ defines, wrapped ioctl calls to - * prevent EFAULT errors - * - * 2002/9/30 - Jay Vosburgh - * - make sure the ip target matches the arp_target before saving the - * hw address. - * - * 2002/9/30 - Dan Eisner - * - make sure my_ip is set before taking down the link, since - * not all switches respond if the source ip is not set. - * - * 2002/10/8 - Janice Girouard - * - read in the local ip address when enslaving a device - * - add primary support - * - make sure 2*arp_interval has passed when a new device - * is brought on-line before taking it down. - * - * 2002/09/11 - Philippe De Muyter - * - Added bond_xmit_broadcast logic. 
- * - Added bond_mode() support function. - * - * 2002/10/26 - Laurent Deniel - * - allow to register multicast addresses only on active slave - * (useful in active-backup mode) - * - add multicast module parameter - * - fix deletion of multicast groups after unloading module - * - * 2002/11/06 - Kameshwara Rayaprolu - * - Changes to prevent panic from closing the device twice; if we close - * the device in bond_release, we must set the original_flags to down - * so it won't be closed again by the network layer. - * - * 2002/11/07 - Tony Cureington - * - Fix arp_target_hw_addr memory leak - * - Created activebackup_arp_monitor function to handle arp monitoring - * in active backup mode - the bond_arp_monitor had several problems... - * such as allowing slaves to tx arps sequentially without any delay - * for a response - * - Renamed bond_arp_monitor to loadbalance_arp_monitor and re-wrote - * this function to just handle arp monitoring in load-balancing mode; - * it is a lot more compact now - * - Changes to ensure one and only one slave transmits in active-backup - * mode - * - Robustesize parameters; warn users about bad combinations of - * parameters; also if miimon is specified and a network driver does - * not support MII or ETHTOOL, inform the user of this - * - Changes to support link_failure_count when in arp monitoring mode - * - Fix up/down delay reported in /proc - * - Added version; log version; make version available from "modinfo -d" - * - Fixed problem in bond_check_dev_link - if the first IOCTL (SIOCGMIIPH) - * failed, the ETHTOOL ioctl never got a chance - * - * 2002/11/16 - Laurent Deniel - * - fix multicast handling in activebackup_arp_monitor - * - remove one unnecessary and confusing current_slave == slave test - * in activebackup_arp_monitor - * - * 2002/11/17 - Laurent Deniel - * - fix bond_slave_info_query when slave_id = num_slaves - * - * 2002/11/19 - Janice Girouard - * - correct ifr_data reference. 
Update ifr_data reference - * to mii_ioctl_data struct values to avoid confusion. - * - * 2002/11/22 - Bert Barbe - * - Add support for multiple arp_ip_target - * - * 2002/12/13 - Jay Vosburgh - * - Changed to allow text strings for mode and multicast, e.g., - * insmod bonding mode=active-backup. The numbers still work. - * One change: an invalid choice will cause module load failure, - * rather than the previous behavior of just picking one. - * - Minor cleanups; got rid of dup ctype stuff, atoi function - * - * 2003/02/07 - Jay Vosburgh - * - Added use_carrier module parameter that causes miimon to - * use netif_carrier_ok() test instead of MII/ETHTOOL ioctls. - * - Minor cleanups; consolidated ioctl calls to one function. - * - * 2003/02/07 - Tony Cureington - * - Fix bond_mii_monitor() logic error that could result in - * bonding round-robin mode ignoring links after failover/recovery - * - * 2003/03/17 - Jay Vosburgh - * - kmalloc fix (GFP_KERNEL to GFP_ATOMIC) reported by - * Shmulik dot Hen at intel.com. - * - Based on discussion on mailing list, changed use of - * update_slave_cnt(), created wrapper functions for adding/removing - * slaves, changed bond_xmit_xor() to check slave_cnt instead of - * checking slave and slave->dev (which only worked by accident). - * - Misc code cleanup: get arp_send() prototype from header file, - * add max_bonds to bonding.txt. - * - * 2003/03/18 - Tsippy Mendelson and - * Shmulik Hen - * - Make sure only bond_attach_slave() and bond_detach_slave() can - * manipulate the slave list, including slave_cnt, even when in - * bond_release_all(). - * - Fixed hang in bond_release() while traffic is running. - * netdev_set_master() must not be called from within the bond lock. - * - * 2003/03/18 - Tsippy Mendelson and - * Shmulik Hen - * - Fixed hang in bond_enslave(): netdev_set_master() must not be - * called from within the bond lock while traffic is running. 
- * - * 2003/03/18 - Amir Noam - * - Added support for getting slave's speed and duplex via ethtool. - * Needed for 802.3ad and other future modes. - * - * 2003/03/18 - Tsippy Mendelson and - * Shmulik Hen - * - Enable support of modes that need to use the unique mac address of - * each slave. - * * bond_enslave(): Moved setting the slave's mac address, and - * openning it, from the application to the driver. This breaks - * backward comaptibility with old versions of ifenslave that open - * the slave before enalsving it !!!. - * * bond_release(): The driver also takes care of closing the slave - * and restoring its original mac address. - * - Removed the code that restores all base driver's flags. - * Flags are automatically restored once all undo stages are done - * properly. - * - Block possibility of enslaving before the master is up. This - * prevents putting the system in an unstable state. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include - -#define DRV_VERSION "2.4.20-20030317" -#define DRV_RELDATE "March 17, 2003" -#define DRV_NAME "bonding" -#define DRV_DESCRIPTION "Ethernet Channel Bonding Driver" - -static const char *version = -DRV_NAME ".c:v" DRV_VERSION " (" DRV_RELDATE ")\n"; - -/* monitor all links that often (in milliseconds). 
<=0 disables monitoring */ -#ifndef BOND_LINK_MON_INTERV -#define BOND_LINK_MON_INTERV 0 -#endif - -#ifndef BOND_LINK_ARP_INTERV -#define BOND_LINK_ARP_INTERV 0 -#endif - -#ifndef MAX_ARP_IP_TARGETS -#define MAX_ARP_IP_TARGETS 16 -#endif - -static int arp_interval = BOND_LINK_ARP_INTERV; -static char *arp_ip_target[MAX_ARP_IP_TARGETS] = { NULL, }; -static unsigned long arp_target[MAX_ARP_IP_TARGETS] = { 0, } ; -static int arp_ip_count = 0; -static u32 my_ip = 0; -char *arp_target_hw_addr = NULL; - -static char *primary= NULL; - -static int max_bonds = BOND_DEFAULT_MAX_BONDS; -static int miimon = BOND_LINK_MON_INTERV; -static int use_carrier = 1; -static int bond_mode = BOND_MODE_ROUNDROBIN; -static int updelay = 0; -static int downdelay = 0; - -static char *mode = NULL; - -static struct bond_parm_tbl bond_mode_tbl[] = { -{ "balance-rr", BOND_MODE_ROUNDROBIN}, -{ "active-backup", BOND_MODE_ACTIVEBACKUP}, -{ "balance-xor", BOND_MODE_XOR}, -{ "broadcast", BOND_MODE_BROADCAST}, -{ NULL, -1}, -}; - -static int multicast_mode = BOND_MULTICAST_ALL; -static char *multicast = NULL; - -static struct bond_parm_tbl bond_mc_tbl[] = { -{ "disabled", BOND_MULTICAST_DISABLED}, -{ "active", BOND_MULTICAST_ACTIVE}, -{ "all", BOND_MULTICAST_ALL}, -{ NULL, -1}, -}; - -static int first_pass = 1; -static struct bonding *these_bonds = NULL; -static struct net_device *dev_bonds = NULL; - -MODULE_PARM(max_bonds, "i"); -MODULE_PARM_DESC(max_bonds, "Max number of bonded devices"); -MODULE_PARM(miimon, "i"); -MODULE_PARM_DESC(miimon, "Link check interval in milliseconds"); -MODULE_PARM(use_carrier, "i"); -MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; 0 for off, 1 for on (default)"); -MODULE_PARM(mode, "s"); -MODULE_PARM_DESC(mode, "Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor"); -MODULE_PARM(arp_interval, "i"); -MODULE_PARM_DESC(arp_interval, "arp interval in milliseconds"); -MODULE_PARM(arp_ip_target, "1-"
__MODULE_STRING(MAX_ARP_IP_TARGETS) "s"); -MODULE_PARM_DESC(arp_ip_target, "arp targets in n.n.n.n form"); -MODULE_PARM(updelay, "i"); -MODULE_PARM_DESC(updelay, "Delay before considering link up, in milliseconds"); -MODULE_PARM(downdelay, "i"); -MODULE_PARM_DESC(downdelay, "Delay before considering link down, in milliseconds"); -MODULE_PARM(primary, "s"); -MODULE_PARM_DESC(primary, "Primary network device to use"); -MODULE_PARM(multicast, "s"); -MODULE_PARM_DESC(multicast, "Mode for multicast support : 0 for none, 1 for active slave, 2 for all slaves (default)"); - -static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *dev); -static int bond_xmit_xor(struct sk_buff *skb, struct net_device *dev); -static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *dev); -static struct net_device_stats *bond_get_stats(struct net_device *dev); -static void bond_mii_monitor(struct net_device *dev); -static void loadbalance_arp_monitor(struct net_device *dev); -static void activebackup_arp_monitor(struct net_device *dev); -static int bond_event(struct notifier_block *this, unsigned long event, void *ptr); -static void bond_mc_list_destroy(struct bonding *bond); -static void bond_mc_add(bonding_t *bond, void *addr, int alen); -static void bond_mc_delete(bonding_t *bond, void *addr, int alen); -static int bond_mc_list_copy (struct dev_mc_list *src, struct bonding *dst, int gpf_flag); -static inline int dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2); -static void bond_set_promiscuity(bonding_t *bond, int inc); -static void bond_set_allmulti(bonding_t *bond, int inc); -static struct dev_mc_list* bond_mc_list_find_dmi(struct dev_mc_list *dmi, struct dev_mc_list *mc_list); -static void bond_mc_update(bonding_t *bond, slave_t *new, slave_t *old); -static void bond_set_slave_inactive_flags(slave_t *slave); -static void bond_set_slave_active_flags(slave_t *slave); -static int bond_enslave(struct net_device *master, struct net_device 
*slave); -static int bond_release(struct net_device *master, struct net_device *slave); -static int bond_release_all(struct net_device *master); -static int bond_sethwaddr(struct net_device *master, struct net_device *slave); - -/* - * bond_get_info is the interface into the /proc filesystem. This is - * a different interface than the BOND_INFO_QUERY ioctl. That is done - * through the generic networking ioctl interface, and bond_info_query - * is the internal function which provides that information. - */ -static int bond_get_info(char *buf, char **start, off_t offset, int length); - -/* #define BONDING_DEBUG 1 */ - -/* several macros */ - -#define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \ - (netif_running(dev) && netif_carrier_ok(dev))) - -static void arp_send_all(slave_t *slave) -{ - int i; - - for (i = 0; (i<MAX_ARP_IP_TARGETS) && arp_target[i]; i++) { - arp_send(ARPOP_REQUEST, ETH_P_ARP, arp_target[i], slave->dev, - my_ip, arp_target_hw_addr, slave->dev->dev_addr, - arp_target_hw_addr); - } -} - - -static const char * -bond_mode_name(void) -{ - switch (bond_mode) { - case BOND_MODE_ROUNDROBIN : - return "load balancing (round-robin)"; - case BOND_MODE_ACTIVEBACKUP : - return "fault-tolerance (active-backup)"; - case BOND_MODE_XOR : - return "load balancing (xor)"; - case BOND_MODE_BROADCAST : - return "fault-tolerance (broadcast)"; - default : - return "unknown"; - } -} - -static const char * -multicast_mode_name(void) -{ - switch(multicast_mode) { - case BOND_MULTICAST_DISABLED : - return "disabled"; - case BOND_MULTICAST_ACTIVE : - return "active slave only"; - case BOND_MULTICAST_ALL : - return "all slaves"; - default : - return "unknown"; - } -} - -static void bond_set_slave_inactive_flags(slave_t *slave) -{ - slave->state = BOND_STATE_BACKUP; - slave->dev->flags |= IFF_NOARP; -} - -static void bond_set_slave_active_flags(slave_t *slave) -{ - slave->state = BOND_STATE_ACTIVE; - slave->dev->flags &= ~IFF_NOARP; -} - -/* - * This function counts and verifies the the number of attached - * slaves, checking the count against the expected value
(given that incr - * is either 1 or -1, for add or removal of a slave). Only - * bond_xmit_xor() uses the slave_cnt value, but this is still a good - * consistency check. - */ -static inline void -update_slave_cnt(bonding_t *bond, int incr) -{ - slave_t *slave = NULL; - int expect = bond->slave_cnt + incr; - - bond->slave_cnt = 0; - for (slave = bond->prev; slave != (slave_t*)bond; - slave = slave->prev) { - bond->slave_cnt++; - } - - if (expect != bond->slave_cnt) - BUG(); -} - -/* - * This function detaches the slave from the list . - * WARNING: no check is made to verify if the slave effectively - * belongs to . It returns in case it's needed. - * Nothing is freed on return, structures are just unchained. - * If the bond->current_slave pointer was pointing to , - * it's replaced with slave->next, or if not applicable. - * - * bond->lock held by caller. - */ -static slave_t * -bond_detach_slave(bonding_t *bond, slave_t *slave) -{ - if ((bond == NULL) || (slave == NULL) || - ((void *)bond == (void *)slave)) { - printk(KERN_ERR - "bond_detach_slave(): trying to detach " - "slave %p from bond %p\n", bond, slave); - return slave; - } - - if (bond->next == slave) { /* is the slave at the head ? */ - if (bond->prev == slave) { /* is the slave alone ? */ - write_lock(&bond->ptrlock); - bond->current_slave = NULL; /* no slave anymore */ - write_unlock(&bond->ptrlock); - bond->prev = bond->next = (slave_t *)bond; - } else { /* not alone */ - bond->next = slave->next; - slave->next->prev = (slave_t *)bond; - bond->prev->next = slave->next; - - write_lock(&bond->ptrlock); - if (bond->current_slave == slave) { - bond->current_slave = slave->next; - } - write_unlock(&bond->ptrlock); - } - } else { - slave->prev->next = slave->next; - if (bond->prev == slave) { /* is this slave the last one ? 
*/ - bond->prev = slave->prev; - } else { - slave->next->prev = slave->prev; - } - - write_lock(&bond->ptrlock); - if (bond->current_slave == slave) { - bond->current_slave = slave->next; - } - write_unlock(&bond->ptrlock); - } - - update_slave_cnt(bond, -1); - - return slave; -} - -static void -bond_attach_slave(struct bonding *bond, struct slave *new_slave) -{ - /* - * queue to the end of the slaves list, make the first element its - * successor, the last one its predecessor, and make it the bond's - * predecessor. - * - * Just to clarify, so future bonding driver hackers don't go through - * the same confusion stage I did trying to figure this out, the - * slaves are stored in a double linked circular list, sortof. - * In the ->next direction, the last slave points to the first slave, - * bypassing bond; only the slaves are in the ->next direction. - * In the ->prev direction, however, the first slave points to bond - * and bond points to the last slave. - * - * It looks like a circle with a little bubble hanging off one side - * in the ->prev direction only. - * - * When going through the list once, its best to start at bond->prev - * and go in the ->prev direction, testing for bond. Doing this - * in the ->next direction doesn't work. Trust me, I know this now. - * :) -mts 2002.03.14 - */ - new_slave->prev = bond->prev; - new_slave->prev->next = new_slave; - bond->prev = new_slave; - new_slave->next = bond->next; - - update_slave_cnt(bond, 1); -} - - -/* - * Less bad way to call ioctl from within the kernel; this needs to be - * done some other way to get the call out of interrupt context. - * Needs "ioctl" variable to be supplied by calling context. - */ -#define IOCTL(dev, arg, cmd) ({ \ - int ret; \ - mm_segment_t fs = get_fs(); \ - set_fs(get_ds()); \ - ret = ioctl(dev, arg, cmd); \ - set_fs(fs); \ - ret; }) - -/* - * Get link speed and duplex from the slave's base driver - * using ethtool. 
If for some reason the call fails or the - * values are invalid, fake speed and duplex to 100/Full - * and return error. - */ -static int bond_update_speed_duplex(struct slave *slave) -{ - struct net_device *dev = slave->dev; - static int (* ioctl)(struct net_device *, struct ifreq *, int); - struct ifreq ifr; - struct ethtool_cmd etool; - - ioctl = dev->do_ioctl; - if (ioctl) { - etool.cmd = ETHTOOL_GSET; - ifr.ifr_data = (char*)&etool; - if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) { - slave->speed = etool.speed; - slave->duplex = etool.duplex; - } else { - goto err_out; - } - } else { - goto err_out; - } - - switch (slave->speed) { - case SPEED_10: - case SPEED_100: - case SPEED_1000: - break; - default: - goto err_out; - } - - switch (slave->duplex) { - case DUPLEX_FULL: - case DUPLEX_HALF: - break; - default: - goto err_out; - } - - return 0; - -err_out: - //Fake speed and duplex - slave->speed = SPEED_100; - slave->duplex = DUPLEX_FULL; - return -1; -} - -/* - * if supports MII link status reporting, check its link status. - * - * We either do MII/ETHTOOL ioctls, or check netif_carrier_ok(), - * depening upon the setting of the use_carrier parameter. - * - * Return either BMSR_LSTATUS, meaning that the link is up (or we - * can't tell and just pretend it is), or 0, meaning that the link is - * down. - * - * If reporting is non-zero, instead of faking link up, return -1 if - * both ETHTOOL and MII ioctls fail (meaning the device does not - * support them). If use_carrier is set, return whatever it says. - * It'd be nice if there was a good way to tell if a driver supports - * netif_carrier, but there really isn't. - */ -static int -bond_check_dev_link(struct net_device *dev, int reporting) -{ - static int (* ioctl)(struct net_device *, struct ifreq *, int); - struct ifreq ifr; - struct mii_ioctl_data *mii; - struct ethtool_value etool; - - if (use_carrier) { - return netif_carrier_ok(dev) ? 
BMSR_LSTATUS : 0; - } - - ioctl = dev->do_ioctl; - if (ioctl) { - /* TODO: set pointer to correct ioctl on a per team member */ - /* bases to make this more efficient. that is, once */ - /* we determine the correct ioctl, we will always */ - /* call it and not the others for that team */ - /* member. */ - - /* - * We cannot assume that SIOCGMIIPHY will also read a - * register; not all network drivers (e.g., e100) - * support that. - */ - - /* Yes, the mii is overlaid on the ifreq.ifr_ifru */ - mii = (struct mii_ioctl_data *)&ifr.ifr_data; - if (IOCTL(dev, &ifr, SIOCGMIIPHY) == 0) { - mii->reg_num = MII_BMSR; - if (IOCTL(dev, &ifr, SIOCGMIIREG) == 0) { - return mii->val_out & BMSR_LSTATUS; - } - } - - /* try SIOCETHTOOL ioctl, some drivers cache ETHTOOL_GLINK */ - /* for a period of time so we attempt to get link status */ - /* from it last if the above MII ioctls fail... */ - etool.cmd = ETHTOOL_GLINK; - ifr.ifr_data = (char*)&etool; - if (IOCTL(dev, &ifr, SIOCETHTOOL) == 0) { - if (etool.data == 1) { - return BMSR_LSTATUS; - } else { -#ifdef BONDING_DEBUG - printk(KERN_INFO - ":: SIOCETHTOOL shows link down \n"); -#endif - return 0; - } - } - - } - - /* - * If reporting, report that either there's no dev->do_ioctl, - * or both SIOCGMIIREG and SIOCETHTOOL failed (meaning that we - * cannot report link status). If not reporting, pretend - * we're ok. - */ - return reporting ? -1 : BMSR_LSTATUS; -} - -static u16 bond_check_mii_link(bonding_t *bond) -{ - int has_active_interface = 0; - unsigned long flags; - - read_lock_irqsave(&bond->lock, flags); - read_lock(&bond->ptrlock); - has_active_interface = (bond->current_slave != NULL); - read_unlock(&bond->ptrlock); - read_unlock_irqrestore(&bond->lock, flags); - - return (has_active_interface ? 
BMSR_LSTATUS : 0); -} - -static int bond_open(struct net_device *dev) -{ - struct timer_list *timer = &((struct bonding *)(dev->priv))->mii_timer; - struct timer_list *arp_timer = &((struct bonding *)(dev->priv))->arp_timer; - MOD_INC_USE_COUNT; - - if (miimon > 0) { /* link check interval, in milliseconds. */ - init_timer(timer); - timer->expires = jiffies + (miimon * HZ / 1000); - timer->data = (unsigned long)dev; - timer->function = (void *)&bond_mii_monitor; - add_timer(timer); - } - - if (arp_interval> 0) { /* arp interval, in milliseconds. */ - init_timer(arp_timer); - arp_timer->expires = jiffies + (arp_interval * HZ / 1000); - arp_timer->data = (unsigned long)dev; - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - arp_timer->function = (void *)&activebackup_arp_monitor; - } else { - arp_timer->function = (void *)&loadbalance_arp_monitor; - } - add_timer(arp_timer); - } - return 0; -} - -static int bond_close(struct net_device *master) -{ - bonding_t *bond = (struct bonding *) master->priv; - unsigned long flags; - - write_lock_irqsave(&bond->lock, flags); - - if (miimon > 0) { /* link check interval, in milliseconds. */ - del_timer(&bond->mii_timer); - } - if (arp_interval> 0) { /* arp interval, in milliseconds. 
*/ - del_timer(&bond->arp_timer); - if (arp_target_hw_addr != NULL) { - kfree(arp_target_hw_addr); - arp_target_hw_addr = NULL; - } - } - - /* Release the bonded slaves */ - bond_release_all(master); - bond_mc_list_destroy (bond); - - write_unlock_irqrestore(&bond->lock, flags); - - MOD_DEC_USE_COUNT; - return 0; -} - -/* - * flush all members of flush->mc_list from device dev->mc_list - */ -static void bond_mc_list_flush(struct net_device *dev, struct net_device *flush) -{ - struct dev_mc_list *dmi; - - for (dmi = flush->mc_list; dmi != NULL; dmi = dmi->next) - dev_mc_delete(dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); -} - -/* - * Totally destroys the mc_list in bond - */ -static void bond_mc_list_destroy(struct bonding *bond) -{ - struct dev_mc_list *dmi; - - dmi = bond->mc_list; - while (dmi) { - bond->mc_list = dmi->next; - kfree(dmi); - dmi = bond->mc_list; - } -} - -/* - * Add a Multicast address to every slave in the bonding group - */ -static void bond_mc_add(bonding_t *bond, void *addr, int alen) -{ - slave_t *slave; - switch (multicast_mode) { - case BOND_MULTICAST_ACTIVE : - /* write lock already acquired */ - if (bond->current_slave != NULL) - dev_mc_add(bond->current_slave->dev, addr, alen, 0); - break; - case BOND_MULTICAST_ALL : - for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev) - dev_mc_add(slave->dev, addr, alen, 0); - break; - case BOND_MULTICAST_DISABLED : - break; - } -} - -/* - * Remove a multicast address from every slave in the bonding group - */ -static void bond_mc_delete(bonding_t *bond, void *addr, int alen) -{ - slave_t *slave; - switch (multicast_mode) { - case BOND_MULTICAST_ACTIVE : - /* write lock already acquired */ - if (bond->current_slave != NULL) - dev_mc_delete(bond->current_slave->dev, addr, alen, 0); - break; - case BOND_MULTICAST_ALL : - for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev) - dev_mc_delete(slave->dev, addr, alen, 0); - break; - case BOND_MULTICAST_DISABLED : - break; - } 
-} - -/* - * Copy all the Multicast addresses from src to the bonding device dst - */ -static int bond_mc_list_copy (struct dev_mc_list *src, struct bonding *dst, - int gpf_flag) -{ - struct dev_mc_list *dmi, *new_dmi; - - for (dmi = src; dmi != NULL; dmi = dmi->next) { - new_dmi = kmalloc(sizeof(struct dev_mc_list), gpf_flag); - - if (new_dmi == NULL) { - return -ENOMEM; - } - - new_dmi->next = dst->mc_list; - dst->mc_list = new_dmi; - - new_dmi->dmi_addrlen = dmi->dmi_addrlen; - memcpy(new_dmi->dmi_addr, dmi->dmi_addr, dmi->dmi_addrlen); - new_dmi->dmi_users = dmi->dmi_users; - new_dmi->dmi_gusers = dmi->dmi_gusers; - } - return 0; -} - -/* - * Returns 0 if dmi1 and dmi2 are the same, non-0 otherwise - */ -static inline int dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2) -{ - return memcmp(dmi1->dmi_addr, dmi2->dmi_addr, dmi1->dmi_addrlen) == 0 && - dmi1->dmi_addrlen == dmi2->dmi_addrlen; -} - -/* - * Push the promiscuity flag down to all slaves - */ -static void bond_set_promiscuity(bonding_t *bond, int inc) -{ - slave_t *slave; - switch (multicast_mode) { - case BOND_MULTICAST_ACTIVE : - /* write lock already acquired */ - if (bond->current_slave != NULL) - dev_set_promiscuity(bond->current_slave->dev, inc); - break; - case BOND_MULTICAST_ALL : - for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev) - dev_set_promiscuity(slave->dev, inc); - break; - case BOND_MULTICAST_DISABLED : - break; - } -} - -/* - * Push the allmulti flag down to all slaves - */ -static void bond_set_allmulti(bonding_t *bond, int inc) -{ - slave_t *slave; - switch (multicast_mode) { - case BOND_MULTICAST_ACTIVE : - /* write lock already acquired */ - if (bond->current_slave != NULL) - dev_set_allmulti(bond->current_slave->dev, inc); - break; - case BOND_MULTICAST_ALL : - for (slave = bond->prev; slave != (slave_t*)bond; slave = slave->prev) - dev_set_allmulti(slave->dev, inc); - break; - case BOND_MULTICAST_DISABLED : - break; - } -} - -/* - * returns dmi 
entry if found, NULL otherwise - */ -static struct dev_mc_list* bond_mc_list_find_dmi(struct dev_mc_list *dmi, - struct dev_mc_list *mc_list) -{ - struct dev_mc_list *idmi; - - for (idmi = mc_list; idmi != NULL; idmi = idmi->next) { - if (dmi_same(dmi, idmi)) { - return idmi; - } - } - return NULL; -} - -static void set_multicast_list(struct net_device *master) -{ - bonding_t *bond = master->priv; - struct dev_mc_list *dmi; - unsigned long flags = 0; - - if (multicast_mode == BOND_MULTICAST_DISABLED) - return; - /* - * Lock the private data for the master - */ - write_lock_irqsave(&bond->lock, flags); - - /* set promiscuity flag to slaves */ - if ( (master->flags & IFF_PROMISC) && !(bond->flags & IFF_PROMISC) ) - bond_set_promiscuity(bond, 1); - - if ( !(master->flags & IFF_PROMISC) && (bond->flags & IFF_PROMISC) ) - bond_set_promiscuity(bond, -1); - - /* set allmulti flag to slaves */ - if ( (master->flags & IFF_ALLMULTI) && !(bond->flags & IFF_ALLMULTI) ) - bond_set_allmulti(bond, 1); - - if ( !(master->flags & IFF_ALLMULTI) && (bond->flags & IFF_ALLMULTI) ) - bond_set_allmulti(bond, -1); - - bond->flags = master->flags; - - /* looking for addresses to add to slaves' mc list */ - for (dmi = master->mc_list; dmi != NULL; dmi = dmi->next) { - if (bond_mc_list_find_dmi(dmi, bond->mc_list) == NULL) - bond_mc_add(bond, dmi->dmi_addr, dmi->dmi_addrlen); - } - - /* looking for addresses to delete from slaves' list */ - for (dmi = bond->mc_list; dmi != NULL; dmi = dmi->next) { - if (bond_mc_list_find_dmi(dmi, master->mc_list) == NULL) - bond_mc_delete(bond, dmi->dmi_addr, dmi->dmi_addrlen); - } - - - /* save master's multicast list */ - bond_mc_list_destroy (bond); - bond_mc_list_copy (master->mc_list, bond, GFP_ATOMIC); - - write_unlock_irqrestore(&bond->lock, flags); -} - -/* - * Update the mc list and multicast-related flags for the new and - * old active slaves (if any) according to the multicast mode - */ -static void bond_mc_update(bonding_t *bond, slave_t *new, 
slave_t *old) -{ - struct dev_mc_list *dmi; - - switch(multicast_mode) { - case BOND_MULTICAST_ACTIVE : - if (bond->device->flags & IFF_PROMISC) { - if (old != NULL && new != old) - dev_set_promiscuity(old->dev, -1); - dev_set_promiscuity(new->dev, 1); - } - if (bond->device->flags & IFF_ALLMULTI) { - if (old != NULL && new != old) - dev_set_allmulti(old->dev, -1); - dev_set_allmulti(new->dev, 1); - } - /* first remove all mc addresses from old slave if any, - and _then_ add them to new active slave */ - if (old != NULL && new != old) { - for (dmi = bond->device->mc_list; dmi != NULL; dmi = dmi->next) - dev_mc_delete(old->dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); - } - for (dmi = bond->device->mc_list; dmi != NULL; dmi = dmi->next) - dev_mc_add(new->dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); - break; - case BOND_MULTICAST_ALL : - /* nothing to do: mc list is already up-to-date on all slaves */ - break; - case BOND_MULTICAST_DISABLED : - break; - } -} - -/* enslave device to bond device */ -static int bond_enslave(struct net_device *master_dev, - struct net_device *slave_dev) -{ - bonding_t *bond = NULL; - slave_t *new_slave = NULL; - unsigned long flags = 0; - unsigned long rflags = 0; - int err = 0; - struct dev_mc_list *dmi; - struct in_ifaddr **ifap; - struct in_ifaddr *ifa; - int link_reporting; - struct sockaddr addr; - - if (master_dev == NULL || slave_dev == NULL) { - return -ENODEV; - } - bond = (struct bonding *) master_dev->priv; - - if (slave_dev->do_ioctl == NULL) { - printk(KERN_DEBUG - "Warning : no link monitoring support for %s\n", - slave_dev->name); - } - - /* This breaks backward comaptibility with old versions - of ifenslave which open the slave before enalsving */ - /* already up. 
*/ - if ((slave_dev->flags & IFF_UP) == IFF_UP) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Error, slave_dev is up\n"); -#endif - return -EBUSY; - } - - /* already enslaved */ - if (master_dev->flags & IFF_SLAVE || slave_dev->flags & IFF_SLAVE) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Error, Device was already enslaved\n"); -#endif - return -EBUSY; - } - - /* bond must be initialize by bond_open() before enslaving */ - if ((master_dev->flags & IFF_UP) != IFF_UP) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Error, master_dev is not up\n"); -#endif - return -EPERM; - } - - if (slave_dev->set_mac_address == NULL) { - printk(KERN_CRIT " The slave device you specified does not support" - " setting the MAC address.\n Your kernel likely does not" - " support slave devices.\n"); - return -EOPNOTSUPP; - } - - if ((new_slave = kmalloc(sizeof(slave_t), GFP_ATOMIC)) == NULL) { - return -ENOMEM; - } - memset(new_slave, 0, sizeof(slave_t)); - - /* save slave's original flags before calling */ - /* netdev_set_master and dev_open */ - new_slave->original_flags = slave_dev->flags; - - /* save slave's original ("permanent") mac address for - modes that needs it, and for restoring it upon release, - and then set it to the master's address */ - memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN); - - if (bond->next != (slave_t*)bond) { - /* set slave to master's mac address - The application already set the master's - mac address to that of the first slave */ - memcpy(addr.sa_data, master_dev->dev_addr, ETH_ALEN); - addr.sa_family = slave_dev->type; - err = slave_dev->set_mac_address(slave_dev, &addr); - if (err) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Error %d calling set_mac_address\n", err); -#endif - goto err_free; - } - } - - /* open the slave since the application closed it */ - err = dev_open(slave_dev); - if (err) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Openning slave %s failed\n", slave_dev->name); -#endif - goto err_restore_mac; - } - - err = 
netdev_set_master(slave_dev, master_dev); - - if (err) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Error %d calling netdev_set_master\n", err); -#endif - goto err_close; - } - - new_slave->dev = slave_dev; - - if (multicast_mode == BOND_MULTICAST_ALL) { - /* set promiscuity level to new slave */ - if (master_dev->flags & IFF_PROMISC) - dev_set_promiscuity(slave_dev, 1); - - /* set allmulti level to new slave */ - if (master_dev->flags & IFF_ALLMULTI) - dev_set_allmulti(slave_dev, 1); - - /* upload master's mc_list to new slave */ - for (dmi = master_dev->mc_list; dmi != NULL; dmi = dmi->next) - dev_mc_add (slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); - } - - write_lock_irqsave(&bond->lock, flags); - - bond_attach_slave(bond, new_slave); - new_slave->delay = 0; - new_slave->link_failure_count = 0; - - if (miimon > 0 && !use_carrier) { - link_reporting = bond_check_dev_link(slave_dev, 1); - - if ((link_reporting == -1) && (arp_interval == 0)) { - /* - * miimon is set but a bonded network driver - * does not support ETHTOOL/MII and - * arp_interval is not set. Note: if - * use_carrier is enabled, we will never go - * here (because netif_carrier is always - * supported); thus, we don't need to change - * the messages for netif_carrier. - */ - printk(KERN_ERR - "bond_enslave(): MII and ETHTOOL support not " - "available for interface %s, and " - "arp_interval/arp_ip_target module parameters " - "not specified, thus bonding will not detect " - "link failures! 
see bonding.txt for details.\n", - slave_dev->name); - } else if (link_reporting == -1) { - /* unable get link status using mii/ethtool */ - printk(KERN_WARNING - "bond_enslave: can't get link status from " - "interface %s; the network driver associated " - "with this interface does not support " - "MII or ETHTOOL link status reporting, thus " - "miimon has no effect on this interface.\n", - slave_dev->name); - } - } - - /* check for initial state */ - if ((miimon <= 0) || - (bond_check_dev_link(slave_dev, 0) == BMSR_LSTATUS)) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Initial state of slave_dev is BOND_LINK_UP\n"); -#endif - new_slave->link = BOND_LINK_UP; - new_slave->jiffies = jiffies; - } - else { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "Initial state of slave_dev is BOND_LINK_DOWN\n"); -#endif - new_slave->link = BOND_LINK_DOWN; - } - - if (bond_update_speed_duplex(new_slave) && (new_slave->link == BOND_LINK_UP) ) { - printk(KERN_WARNING - "bond_enslave(): failed to get speed/duplex from %s, " - "speed forced to 100Mbps, duplex forced to Full.\n", - new_slave->dev->name); - } - - /* if we're in active-backup mode, we need one and only one active - * interface. The backup interfaces will have their NOARP flag set - * because we need them to be completely deaf and not to respond to - * any ARP request on the network to avoid fooling a switch. Thus, - * since we guarantee that current_slave always point to the last - * usable interface, we just have to verify this interface's flag. 
- */ - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - if (((bond->current_slave == NULL) - || (bond->current_slave->dev->flags & IFF_NOARP)) - && (new_slave->link == BOND_LINK_UP)) { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "This is the first active slave\n"); -#endif - /* first slave or no active slave yet, and this link - is OK, so make this interface the active one */ - bond->current_slave = new_slave; - bond_set_slave_active_flags(new_slave); - bond_mc_update(bond, new_slave, NULL); - } - else { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "This is just a backup slave\n"); -#endif - bond_set_slave_inactive_flags(new_slave); - } - read_lock_irqsave(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); - ifap= &(((struct in_device *)slave_dev->ip_ptr)->ifa_list); - ifa = *ifap; - my_ip = ifa->ifa_address; - read_unlock_irqrestore(&(((struct in_device *)slave_dev->ip_ptr)->lock), rflags); - - /* if there is a primary slave, remember it */ - if (primary != NULL) - if( strcmp(primary, new_slave->dev->name) == 0) - bond->primary_slave = new_slave; - } else { -#ifdef BONDING_DEBUG - printk(KERN_CRIT "This slave is always active in trunk mode\n"); -#endif - /* always active in trunk mode */ - new_slave->state = BOND_STATE_ACTIVE; - if (bond->current_slave == NULL) - bond->current_slave = new_slave; - } - - write_unlock_irqrestore(&bond->lock, flags); - - printk (KERN_INFO "%s: enslaving %s as a%s interface with a%s link.\n", - master_dev->name, slave_dev->name, - new_slave->state == BOND_STATE_ACTIVE ? "n active" : " backup", - new_slave->link == BOND_LINK_UP ? "n up" : " down"); - - //enslave is successfull - return 0; - -// Undo stages on error -err_close: - dev_close(slave_dev); - -err_restore_mac: - memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave_dev->type; - slave_dev->set_mac_address(slave_dev, &addr); - -err_free: - kfree(new_slave); - return err; -} - -/* - * This function changes the active slave to slave <slave_dev>.
- * It returns -EINVAL in the following cases. - * - <slave_dev> is not found in the list. - * - There is not active slave now. - * - <slave_dev> is already active. - * - The link state of <slave_dev> is not BOND_LINK_UP. - * - <slave_dev> is not running. - * In these cases, this fuction does nothing. - * In the other cases, currnt_slave pointer is changed and 0 is returned. - */ -static int bond_change_active(struct net_device *master_dev, struct net_device *slave_dev) -{ - bonding_t *bond; - slave_t *slave; - slave_t *oldactive = NULL; - slave_t *newactive = NULL; - unsigned long flags; - int ret = 0; - - if (master_dev == NULL || slave_dev == NULL) { - return -ENODEV; - } - - bond = (struct bonding *) master_dev->priv; - write_lock_irqsave(&bond->lock, flags); - slave = (slave_t *)bond; - oldactive = bond->current_slave; - - while ((slave = slave->prev) != (slave_t *)bond) { - if(slave_dev == slave->dev) { - newactive = slave; - break; - } - } - - if ((newactive != NULL)&& - (oldactive != NULL)&& - (newactive != oldactive)&& - (newactive->link == BOND_LINK_UP)&& - IS_UP(newactive->dev)) { - bond_set_slave_inactive_flags(oldactive); - bond_set_slave_active_flags(newactive); - bond_mc_update(bond, newactive, oldactive); - bond->current_slave = newactive; - printk("%s : activate %s(old : %s)\n", - master_dev->name, newactive->dev->name, - oldactive->dev->name); - } - else { - ret = -EINVAL; - } - write_unlock_irqrestore(&bond->lock, flags); - return ret; -} - -/* Choose a new valid interface from the pool, set it active - * and make it the current slave. If no valid interface is - * found, the oldest slave in BACK state is choosen and - * activated. If none is found, it's considered as no - * interfaces left so the current slave is set to NULL. - * The result is a pointer to the current slave. - * - * Since this function sends messages tails through printk, the caller - * must have started something like `printk(KERN_INFO "xxxx ");'. - * - * Warning: must put locks around the call to this function if needed. 
- */ -slave_t *change_active_interface(bonding_t *bond) -{ - slave_t *newslave, *oldslave; - slave_t *bestslave = NULL; - int mintime; - - read_lock(&bond->ptrlock); - newslave = oldslave = bond->current_slave; - read_unlock(&bond->ptrlock); - - if (newslave == NULL) { /* there were no active slaves left */ - if (bond->next != (slave_t *)bond) { /* found one slave */ - write_lock(&bond->ptrlock); - newslave = bond->current_slave = bond->next; - write_unlock(&bond->ptrlock); - } else { - - printk (" but could not find any %s interface.\n", - (bond_mode == BOND_MODE_ACTIVEBACKUP) ? "backup":"other"); - write_lock(&bond->ptrlock); - bond->current_slave = (slave_t *)NULL; - write_unlock(&bond->ptrlock); - return NULL; /* still no slave, return NULL */ - } - } else if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - /* make sure oldslave doesn't send arps - this could - * cause a ping-pong effect between interfaces since they - * would be able to tx arps - in active backup only one - * slave should be able to tx arps, and that should be - * the current_slave; the only exception is when all - * slaves have gone down, then only one non-current slave can - * send arps at a time; clearing oldslaves' mc list is handled - * later in this function. 
- */ - bond_set_slave_inactive_flags(oldslave); - } - - mintime = updelay; - - /* first try the primary link; if arping, a link must tx/rx traffic - * before it can be considered the current_slave - also, we would skip - * slaves between the current_slave and primary_slave that may be up - * and able to arp - */ - if ((bond->primary_slave != NULL) && (arp_interval == 0)) { - if (IS_UP(bond->primary_slave->dev)) - newslave = bond->primary_slave; - } - - do { - if (IS_UP(newslave->dev)) { - if (newslave->link == BOND_LINK_UP) { - /* this one is immediately usable */ - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - bond_set_slave_active_flags(newslave); - bond_mc_update(bond, newslave, oldslave); - printk (" and making interface %s the active one.\n", - newslave->dev->name); - } - else { - printk (" and setting pointer to interface %s.\n", - newslave->dev->name); - } - - write_lock(&bond->ptrlock); - bond->current_slave = newslave; - write_unlock(&bond->ptrlock); - return newslave; - } - else if (newslave->link == BOND_LINK_BACK) { - /* link up, but waiting for stabilization */ - if (newslave->delay < mintime) { - mintime = newslave->delay; - bestslave = newslave; - } - } - } - } while ((newslave = newslave->next) != oldslave); - - /* no usable backup found, we'll see if we at least got a link that was - coming back for a long time, and could possibly already be usable. - */ - - if (bestslave != NULL) { - /* early take-over. 
*/ - printk (" and making interface %s the active one %d ms earlier.\n", - bestslave->dev->name, - (updelay - bestslave->delay)*miimon); - - bestslave->delay = 0; - bestslave->link = BOND_LINK_UP; - bestslave->jiffies = jiffies; - bond_set_slave_active_flags(bestslave); - bond_mc_update(bond, bestslave, oldslave); - write_lock(&bond->ptrlock); - bond->current_slave = bestslave; - write_unlock(&bond->ptrlock); - return bestslave; - } - - if ((bond_mode == BOND_MODE_ACTIVEBACKUP) && - (multicast_mode == BOND_MULTICAST_ACTIVE) && - (oldslave != NULL)) { - /* flush bonds (master's) mc_list from oldslave since it wasn't - * updated (and deleted) above - */ - bond_mc_list_flush(oldslave->dev, bond->device); - if (bond->device->flags & IFF_PROMISC) { - dev_set_promiscuity(oldslave->dev, -1); - } - if (bond->device->flags & IFF_ALLMULTI) { - dev_set_allmulti(oldslave->dev, -1); - } - } - - printk (" but could not find any %s interface.\n", - (bond_mode == BOND_MODE_ACTIVEBACKUP) ? "backup":"other"); - - /* absolutely nothing found. let's return NULL */ - write_lock(&bond->ptrlock); - bond->current_slave = (slave_t *)NULL; - write_unlock(&bond->ptrlock); - return NULL; -} - -/* - * Try to release the slave device from the bond device - * It is legal to access current_slave without a lock because all the function - * is write-locked. - * - * The rules for slave state should be: - * for Active/Backup: - * Active stays on all backups go down - * for Bonded connections: - * The first up interface should be left on and all others downed. 
- */ -static int bond_release(struct net_device *master, struct net_device *slave) -{ - bonding_t *bond; - slave_t *our_slave, *old_current; - unsigned long flags; - struct sockaddr addr; - - if (master == NULL || slave == NULL) { - return -ENODEV; - } - - bond = (struct bonding *) master->priv; - - /* master already enslaved, or slave not enslaved, - or no slave for this master */ - if ((master->flags & IFF_SLAVE) || !(slave->flags & IFF_SLAVE)) { - printk (KERN_DEBUG "%s: cannot release %s.\n", master->name, slave->name); - return -EINVAL; - } - - write_lock_irqsave(&bond->lock, flags); - bond->current_arp_slave = NULL; - our_slave = (slave_t *)bond; - old_current = bond->current_slave; - while ((our_slave = our_slave->prev) != (slave_t *)bond) { - if (our_slave->dev == slave) { - bond_detach_slave(bond, our_slave); - - printk (KERN_INFO "%s: releasing %s interface %s", - master->name, - (our_slave->state == BOND_STATE_ACTIVE) ? "active" : "backup", - slave->name); - - if (our_slave == old_current) { - /* find a new interface and be verbose */ - change_active_interface(bond); - } else { - printk(".\n"); - } - - if (bond->current_slave == NULL) { - printk(KERN_INFO - "%s: now running without any active interface !\n", - master->name); - } - - if (bond->primary_slave == our_slave) { - bond->primary_slave = NULL; - } - - break; - } - - } - write_unlock_irqrestore(&bond->lock, flags); - - if (our_slave == (slave_t *)bond) { - /* if we get here, it's because the device was not found */ - printk (KERN_INFO "%s: %s not enslaved\n", master->name, slave->name); - return -EINVAL; - } - - /* undo settings and restore original values */ - - if (multicast_mode == BOND_MULTICAST_ALL) { - /* flush master's mc_list from slave */ - bond_mc_list_flush (slave, master); - - /* unset promiscuity level from slave */ - if (master->flags & IFF_PROMISC) - dev_set_promiscuity(slave, -1); - - /* unset allmulti level from slave */ - if (master->flags & IFF_ALLMULTI) - 
dev_set_allmulti(slave, -1); - } - - netdev_set_master(slave, NULL); - - /* close slave before restoring its mac address */ - dev_close(slave); - - /* restore original ("permanent") mac address*/ - memcpy(addr.sa_data, our_slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave->type; - slave->set_mac_address(slave, &addr); - - /* restore the original state of the IFF_NOARP flag that might have */ - /* been set by bond_set_slave_inactive_flags() */ - if ((our_slave->original_flags & IFF_NOARP) == 0) { - slave->flags &= ~IFF_NOARP; - } - - kfree(our_slave); - - /* if the last slave was removed, zero the mac address - of the master so it will be set by the application - to the mac address of the first slave */ - if (bond->next == (slave_t*)bond) { - memset(master->dev_addr, 0, master->addr_len); - } - - return 0; /* deletion OK */ -} - -/* - * This function releases all slaves. - * Warning: must put write-locks around the call to this function. - */ -static int bond_release_all(struct net_device *master) -{ - bonding_t *bond; - slave_t *our_slave; - struct net_device *slave_dev; - struct sockaddr addr; - - if (master == NULL) { - return -ENODEV; - } - - if (master->flags & IFF_SLAVE) { - return -EINVAL; - } - - bond = (struct bonding *) master->priv; - bond->current_arp_slave = NULL; - bond->current_slave = NULL; - bond->primary_slave = NULL; - - while ((our_slave = bond->prev) != (slave_t *)bond) { - slave_dev = our_slave->dev; - bond_detach_slave(bond, our_slave); - - if (multicast_mode == BOND_MULTICAST_ALL - || (multicast_mode == BOND_MULTICAST_ACTIVE - && bond->current_slave == our_slave)) { - - /* flush master's mc_list from slave */ - bond_mc_list_flush (slave_dev, master); - - /* unset promiscuity level from slave */ - if (master->flags & IFF_PROMISC) - dev_set_promiscuity(slave_dev, -1); - - /* unset allmulti level from slave */ - if (master->flags & IFF_ALLMULTI) - dev_set_allmulti(slave_dev, -1); - } - - /* Can be safely called from inside the bond lock - 
since traffic and timers have already stopped - */ - netdev_set_master(slave_dev, NULL); - - /* close slave before restoring its mac address */ - dev_close(slave_dev); - - /* restore original ("permanent") mac address*/ - memcpy(addr.sa_data, our_slave->perm_hwaddr, ETH_ALEN); - addr.sa_family = slave_dev->type; - slave_dev->set_mac_address(slave_dev, &addr); - - /* restore the original state of the IFF_NOARP flag that might have */ - /* been set by bond_set_slave_inactive_flags() */ - if ((our_slave->original_flags & IFF_NOARP) == 0) { - slave_dev->flags &= ~IFF_NOARP; - } - - kfree(our_slave); - } - - /* zero the mac address of the master so it will be - set by the application to the mac address of the - first slave */ - memset(master->dev_addr, 0, master->addr_len); - - printk (KERN_INFO "%s: released all slaves\n", master->name); - - return 0; -} - -/* this function is called regularly to monitor each slave's link. */ -static void bond_mii_monitor(struct net_device *master) -{ - bonding_t *bond = (struct bonding *) master->priv; - slave_t *slave, *bestslave, *oldcurrent; - unsigned long flags; - int slave_died = 0; - - read_lock_irqsave(&bond->lock, flags); - - /* we will try to read the link status of each of our slaves, and - * set their IFF_RUNNING flag appropriately. For each slave not - * supporting MII status, we won't do anything so that a user-space - * program could monitor the link itself if needed. 
- */ - - bestslave = NULL; - slave = (slave_t *)bond; - - read_lock(&bond->ptrlock); - oldcurrent = bond->current_slave; - read_unlock(&bond->ptrlock); - - while ((slave = slave->prev) != (slave_t *)bond) { - /* use updelay+1 to match an UP slave even when updelay is 0 */ - int mindelay = updelay + 1; - struct net_device *dev = slave->dev; - int link_state; - - link_state = bond_check_dev_link(dev, 0); - - switch (slave->link) { - case BOND_LINK_UP: /* the link was up */ - if (link_state == BMSR_LSTATUS) { - /* link stays up, tell that this one - is immediately available */ - if (IS_UP(dev) && (mindelay > -2)) { - /* -2 is the best case : - this slave was already up */ - mindelay = -2; - bestslave = slave; - } - break; - } - else { /* link going down */ - slave->link = BOND_LINK_FAIL; - slave->delay = downdelay; - if (slave->link_failure_count < UINT_MAX) { - slave->link_failure_count++; - } - if (downdelay > 0) { - printk (KERN_INFO - "%s: link status down for %sinterface " - "%s, disabling it in %d ms.\n", - master->name, - IS_UP(dev) - ? ((bond_mode == BOND_MODE_ACTIVEBACKUP) - ? ((slave == oldcurrent) - ? "active " : "backup ") - : "") - : "idle ", - dev->name, - downdelay * miimon); - } - } - /* no break ! 
fall through the BOND_LINK_FAIL test to - ensure proper action to be taken - */ - case BOND_LINK_FAIL: /* the link has just gone down */ - if (link_state != BMSR_LSTATUS) { - /* link stays down */ - if (slave->delay <= 0) { - /* link down for too long time */ - slave->link = BOND_LINK_DOWN; - /* in active/backup mode, we must - completely disable this interface */ - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - bond_set_slave_inactive_flags(slave); - } - printk(KERN_INFO - "%s: link status definitely down " - "for interface %s, disabling it", - master->name, - dev->name); - - read_lock(&bond->ptrlock); - if (slave == bond->current_slave) { - read_unlock(&bond->ptrlock); - /* find a new interface and be verbose */ - change_active_interface(bond); - } else { - read_unlock(&bond->ptrlock); - printk(".\n"); - } - slave_died = 1; - } else { - slave->delay--; - } - } else { - /* link up again */ - slave->link = BOND_LINK_UP; - slave->jiffies = jiffies; - printk(KERN_INFO - "%s: link status up again after %d ms " - "for interface %s.\n", - master->name, - (downdelay - slave->delay) * miimon, - dev->name); - - if (IS_UP(dev) && (mindelay > -1)) { - /* -1 is a good case : this slave went - down only for a short time */ - mindelay = -1; - bestslave = slave; - } - } - break; - case BOND_LINK_DOWN: /* the link was down */ - if (link_state != BMSR_LSTATUS) { - /* the link stays down, nothing more to do */ - break; - } else { /* link going up */ - slave->link = BOND_LINK_BACK; - slave->delay = updelay; - - if (updelay > 0) { - /* if updelay == 0, no need to - advertise about a 0 ms delay */ - printk (KERN_INFO - "%s: link status up for interface" - " %s, enabling it in %d ms.\n", - master->name, - dev->name, - updelay * miimon); - } - } - /* no break ! fall through the BOND_LINK_BACK state in - case there's something to do. 
- */ - case BOND_LINK_BACK: /* the link has just come back */ - if (link_state != BMSR_LSTATUS) { - /* link down again */ - slave->link = BOND_LINK_DOWN; - printk(KERN_INFO - "%s: link status down again after %d ms " - "for interface %s.\n", - master->name, - (updelay - slave->delay) * miimon, - dev->name); - } else { - /* link stays up */ - if (slave->delay == 0) { - /* now the link has been up for long time enough */ - slave->link = BOND_LINK_UP; - slave->jiffies = jiffies; - - if (bond_mode != BOND_MODE_ACTIVEBACKUP) { - /* make it immediately active */ - slave->state = BOND_STATE_ACTIVE; - } else if (slave != bond->primary_slave) { - /* prevent it from being the active one */ - slave->state = BOND_STATE_BACKUP; - } - - printk(KERN_INFO - "%s: link status definitely up " - "for interface %s.\n", - master->name, - dev->name); - - if ( (bond->primary_slave != NULL) - && (slave == bond->primary_slave) ) - change_active_interface(bond); - } - else - slave->delay--; - - /* we'll also look for the mostly eligible slave */ - if (bond->primary_slave == NULL) { - if (IS_UP(dev) && (slave->delay < mindelay)) { - mindelay = slave->delay; - bestslave = slave; - } - } else if ( (IS_UP(bond->primary_slave->dev)) || - ( (!IS_UP(bond->primary_slave->dev)) && - (IS_UP(dev) && (slave->delay < mindelay)) ) ) { - mindelay = slave->delay; - bestslave = slave; - } - } - break; - } /* end of switch */ - - bond_update_speed_duplex(slave); - - } /* end of while */ - - /* - * if there's no active interface and we discovered that one - * of the slaves could be activated earlier, so we do it. - */ - read_lock(&bond->ptrlock); - oldcurrent = bond->current_slave; - read_unlock(&bond->ptrlock); - - /* no active interface at the moment or need to bring up the primary */ - if (oldcurrent == NULL) { /* no active interface at the moment */ - if (bestslave != NULL) { /* last chance to find one ? 
*/ - if (bestslave->link == BOND_LINK_UP) { - printk (KERN_INFO - "%s: making interface %s the new active one.\n", - master->name, bestslave->dev->name); - } else { - printk (KERN_INFO - "%s: making interface %s the new " - "active one %d ms earlier.\n", - master->name, bestslave->dev->name, - (updelay - bestslave->delay) * miimon); - - bestslave->delay = 0; - bestslave->link = BOND_LINK_UP; - bestslave->jiffies = jiffies; - } - - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - bond_set_slave_active_flags(bestslave); - bond_mc_update(bond, bestslave, NULL); - } else { - bestslave->state = BOND_STATE_ACTIVE; - } - write_lock(&bond->ptrlock); - bond->current_slave = bestslave; - write_unlock(&bond->ptrlock); - } else if (slave_died) { - /* print this message only once a slave has just died */ - printk(KERN_INFO - "%s: now running without any active interface !\n", - master->name); - } - } - - read_unlock_irqrestore(&bond->lock, flags); - /* re-arm the timer */ - mod_timer(&bond->mii_timer, jiffies + (miimon * HZ / 1000)); -} - -/* - * this function is called regularly to monitor each slave's link - * ensuring that traffic is being sent and received when arp monitoring - * is used in load-balancing mode. if the adapter has been dormant, then an - * arp is transmitted to generate traffic. see activebackup_arp_monitor for - * arp monitoring in active backup mode. - */ -static void loadbalance_arp_monitor(struct net_device *master) -{ - bonding_t *bond; - unsigned long flags; - slave_t *slave; - int the_delta_in_ticks = arp_interval * HZ / 1000; - int next_timer = jiffies + (arp_interval * HZ / 1000); - - bond = (struct bonding *) master->priv; - if (master->priv == NULL) { - mod_timer(&bond->arp_timer, next_timer); - return; - } - - read_lock_irqsave(&bond->lock, flags); - - /* TODO: investigate why rtnl_shlock_nowait and rtnl_exlock_nowait - * are called below and add comment why they are required... 
- */ - if ((!IS_UP(master)) || rtnl_shlock_nowait()) { - mod_timer(&bond->arp_timer, next_timer); - read_unlock_irqrestore(&bond->lock, flags); - return; - } - - if (rtnl_exlock_nowait()) { - rtnl_shunlock(); - mod_timer(&bond->arp_timer, next_timer); - read_unlock_irqrestore(&bond->lock, flags); - return; - } - - /* see if any of the previous devices are up now (i.e. they have - * xmt and rcv traffic). the current_slave does not come into - * the picture unless it is null. also, slave->jiffies is not needed - * here because we send an arp on each slave and give a slave as - * long as it needs to get the tx/rx within the delta. - * TODO: what about up/down delay in arp mode? it wasn't here before - * so it can wait - */ - slave = (slave_t *)bond; - while ((slave = slave->prev) != (slave_t *)bond) { - - if (slave->link != BOND_LINK_UP) { - - if (((jiffies - slave->dev->trans_start) <= - the_delta_in_ticks) && - ((jiffies - slave->dev->last_rx) <= - the_delta_in_ticks)) { - - slave->link = BOND_LINK_UP; - slave->state = BOND_STATE_ACTIVE; - - /* primary_slave has no meaning in round-robin - * mode. the window of a slave being up and - * current_slave being null after enslaving - * is closed. 
- */ - read_lock(&bond->ptrlock); - if (bond->current_slave == NULL) { - read_unlock(&bond->ptrlock); - printk(KERN_INFO - "%s: link status definitely up " - "for interface %s, ", - master->name, - slave->dev->name); - change_active_interface(bond); - } else { - read_unlock(&bond->ptrlock); - printk(KERN_INFO - "%s: interface %s is now up\n", - master->name, - slave->dev->name); - } - } - } else { - /* slave->link == BOND_LINK_UP */ - - /* not all switches will respond to an arp request - * when the source ip is 0, so don't take the link down - * if we don't know our ip yet - */ - if (((jiffies - slave->dev->trans_start) >= - (2*the_delta_in_ticks)) || - (((jiffies - slave->dev->last_rx) >= - (2*the_delta_in_ticks)) && my_ip !=0)) { - slave->link = BOND_LINK_DOWN; - slave->state = BOND_STATE_BACKUP; - if (slave->link_failure_count < UINT_MAX) { - slave->link_failure_count++; - } - printk(KERN_INFO - "%s: interface %s is now down.\n", - master->name, - slave->dev->name); - - read_lock(&bond->ptrlock); - if (slave == bond->current_slave) { - read_unlock(&bond->ptrlock); - change_active_interface(bond); - } else { - read_unlock(&bond->ptrlock); - } - } - } - - /* note: if switch is in round-robin mode, all links - * must tx arp to ensure all links rx an arp - otherwise - * links may oscillate or not come up at all; if switch is - * in something like xor mode, there is nothing we can - * do - all replies will be rx'ed on same link causing slaves - * to be unstable during low/no traffic periods - */ - if (IS_UP(slave->dev)) { - arp_send_all(slave); - } - } - - rtnl_exunlock(); - rtnl_shunlock(); - read_unlock_irqrestore(&bond->lock, flags); - - /* re-arm the timer */ - mod_timer(&bond->arp_timer, next_timer); -} - -/* - * When using arp monitoring in active-backup mode, this function is - * called to determine if any backup slaves have went down or a new - * current slave needs to be found. 
- * The backup slaves never generate traffic, they are considered up by merely - * receiving traffic. If the current slave goes down, each backup slave will - * be given the opportunity to tx/rx an arp before being taken down - this - * prevents all slaves from being taken down due to the current slave not - * sending any traffic for the backups to receive. The arps are not necessarily - * necessary, any tx and rx traffic will keep the current slave up. While any - * rx traffic will keep the backup slaves up, the current slave is responsible - * for generating traffic to keep them up regardless of any other traffic they - * may have received. - * see loadbalance_arp_monitor for arp monitoring in load balancing mode - */ -static void activebackup_arp_monitor(struct net_device *master) -{ - bonding_t *bond; - unsigned long flags; - slave_t *slave; - int the_delta_in_ticks = arp_interval * HZ / 1000; - int next_timer = jiffies + (arp_interval * HZ / 1000); - - bond = (struct bonding *) master->priv; - if (master->priv == NULL) { - mod_timer(&bond->arp_timer, next_timer); - return; - } - - read_lock_irqsave(&bond->lock, flags); - - if (!IS_UP(master)) { - mod_timer(&bond->arp_timer, next_timer); - read_unlock_irqrestore(&bond->lock, flags); - return; - } - - /* determine if any slave has come up or any backup slave has - * gone down - * TODO: what about up/down delay in arp mode? 
it wasn't here before - * so it can wait - */ - slave = (slave_t *)bond; - while ((slave = slave->prev) != (slave_t *)bond) { - - if (slave->link != BOND_LINK_UP) { - if ((jiffies - slave->dev->last_rx) <= - the_delta_in_ticks) { - - slave->link = BOND_LINK_UP; - write_lock(&bond->ptrlock); - if ((bond->current_slave == NULL) && - ((jiffies - slave->dev->trans_start) <= - the_delta_in_ticks)) { - bond->current_slave = slave; - bond_set_slave_active_flags(slave); - bond_mc_update(bond, slave, NULL); - bond->current_arp_slave = NULL; - } else if (bond->current_slave != slave) { - /* this slave has just come up but we - * already have a current slave; this - * can also happen if bond_enslave adds - * a new slave that is up while we are - * searching for a new slave - */ - bond_set_slave_inactive_flags(slave); - bond->current_arp_slave = NULL; - } - - if (slave == bond->current_slave) { - printk(KERN_INFO - "%s: %s is up and now the " - "active interface\n", - master->name, - slave->dev->name); - } else { - printk(KERN_INFO - "%s: backup interface %s is " - "now up\n", - master->name, - slave->dev->name); - } - - write_unlock(&bond->ptrlock); - } - } else { - read_lock(&bond->ptrlock); - if ((slave != bond->current_slave) && - (bond->current_arp_slave == NULL) && - (((jiffies - slave->dev->last_rx) >= - 3*the_delta_in_ticks) && (my_ip != 0))) { - /* a backup slave has gone down; three times - * the delta allows the current slave to be - * taken out before the backup slave. 
- * note: a non-null current_arp_slave indicates - * the current_slave went down and we are - * searching for a new one; under this - * condition we only take the current_slave - * down - this gives each slave a chance to - * tx/rx traffic before being taken out - */ - read_unlock(&bond->ptrlock); - slave->link = BOND_LINK_DOWN; - if (slave->link_failure_count < UINT_MAX) { - slave->link_failure_count++; - } - bond_set_slave_inactive_flags(slave); - printk(KERN_INFO - "%s: backup interface %s is now down\n", - master->name, - slave->dev->name); - } else { - read_unlock(&bond->ptrlock); - } - } - } - - read_lock(&bond->ptrlock); - slave = bond->current_slave; - read_unlock(&bond->ptrlock); - - if (slave != NULL) { - - /* if we have sent traffic in the past 2*arp_intervals but - * haven't xmit and rx traffic in that time interval, select - * a different slave. slave->jiffies is only updated when - * a slave first becomes the current_slave - not necessarily - * after every arp; this ensures the slave has a full 2*delta - * before being taken out. 
if a primary is being used, check - * if it is up and needs to take over as the current_slave - */ - if ((((jiffies - slave->dev->trans_start) >= - (2*the_delta_in_ticks)) || - (((jiffies - slave->dev->last_rx) >= - (2*the_delta_in_ticks)) && (my_ip != 0))) && - ((jiffies - slave->jiffies) >= 2*the_delta_in_ticks)) { - - slave->link = BOND_LINK_DOWN; - if (slave->link_failure_count < UINT_MAX) { - slave->link_failure_count++; - } - printk(KERN_INFO "%s: link status down for " - "active interface %s, disabling it", - master->name, - slave->dev->name); - slave = change_active_interface(bond); - bond->current_arp_slave = slave; - if (slave != NULL) { - slave->jiffies = jiffies; - } - - } else if ((bond->primary_slave != NULL) && - (bond->primary_slave != slave) && - (bond->primary_slave->link == BOND_LINK_UP)) { - /* at this point, slave is the current_slave */ - printk(KERN_INFO - "%s: changing from interface %s to primary " - "interface %s\n", - master->name, - slave->dev->name, - bond->primary_slave->dev->name); - - /* primary is up so switch to it */ - bond_set_slave_inactive_flags(slave); - bond_mc_update(bond, bond->primary_slave, slave); - write_lock(&bond->ptrlock); - bond->current_slave = bond->primary_slave; - write_unlock(&bond->ptrlock); - slave = bond->primary_slave; - bond_set_slave_active_flags(slave); - slave->jiffies = jiffies; - } else { - bond->current_arp_slave = NULL; - } - - /* the current slave must tx an arp to ensure backup slaves - * rx traffic - */ - if ((slave != NULL) && - (((jiffies - slave->dev->last_rx) >= the_delta_in_ticks) && - (my_ip != 0))) { - arp_send_all(slave); - } - } - - /* if we don't have a current_slave, search for the next available - * backup slave from the current_arp_slave and make it the candidate - * for becoming the current_slave - */ - if (slave == NULL) { - - if ((bond->current_arp_slave == NULL) || - (bond->current_arp_slave == (slave_t *)bond)) { - bond->current_arp_slave = bond->prev; - } - - if 
(bond->current_arp_slave != (slave_t *)bond) { - bond_set_slave_inactive_flags(bond->current_arp_slave); - slave = bond->current_arp_slave->next; - - /* search for next candidate */ - do { - if (IS_UP(slave->dev)) { - slave->link = BOND_LINK_BACK; - bond_set_slave_active_flags(slave); - arp_send_all(slave); - slave->jiffies = jiffies; - bond->current_arp_slave = slave; - break; - } - - /* if the link state is up at this point, we - * mark it down - this can happen if we have - * simultaneous link failures and - * change_active_interface doesn't make this - * one the current slave so it is still marked - * up when it is actually down - */ - if (slave->link == BOND_LINK_UP) { - slave->link = BOND_LINK_DOWN; - if (slave->link_failure_count < - UINT_MAX) { - slave->link_failure_count++; - } - - bond_set_slave_inactive_flags(slave); - printk(KERN_INFO - "%s: backup interface " - "%s is now down.\n", - master->name, - slave->dev->name); - } - } while ((slave = slave->next) != - bond->current_arp_slave->next); - } - } - - mod_timer(&bond->arp_timer, next_timer); - read_unlock_irqrestore(&bond->lock, flags); -} - -typedef uint32_t in_addr_t; - -int -my_inet_aton(char *cp, unsigned long *the_addr) { - static const in_addr_t max[4] = { 0xffffffff, 0xffffff, 0xffff, 0xff }; - in_addr_t val; - char c; - union iaddr { - uint8_t bytes[4]; - uint32_t word; - } res; - uint8_t *pp = res.bytes; - int digit,base; - - res.word = 0; - - c = *cp; - for (;;) { - /* - * Collect number up to ``.''. - * Values are specified as for C: - * 0x=hex, 0=octal, isdigit=decimal. 
- */ - if (!isdigit(c)) goto ret_0; - val = 0; base = 10; digit = 0; - for (;;) { - if (isdigit(c)) { - val = (val * base) + (c - '0'); - c = *++cp; - digit = 1; - } else { - break; - } - } - if (c == '.') { - /* - * Internet format: - * a.b.c.d - * a.b.c (with c treated as 16 bits) - * a.b (with b treated as 24 bits) - */ - if (pp > res.bytes + 2 || val > 0xff) { - goto ret_0; - } - *pp++ = val; - c = *++cp; - } else - break; - } - /* - * Check for trailing characters. - */ - if (c != '\0' && (!isascii(c) || !isspace(c))) { - goto ret_0; - } - /* - * Did we get a valid digit? - */ - if (!digit) { - goto ret_0; - } - - /* Check whether the last part is in its limits depending on - the number of parts in total. */ - if (val > max[pp - res.bytes]) { - goto ret_0; - } - - if (the_addr != NULL) { - *the_addr = res.word | htonl (val); - } - - return (1); - -ret_0: - return (0); -} - -static int bond_sethwaddr(struct net_device *master, struct net_device *slave) -{ -#ifdef BONDING_DEBUG - printk(KERN_CRIT "bond_sethwaddr: master=%x\n", (unsigned int)master); - printk(KERN_CRIT "bond_sethwaddr: slave=%x\n", (unsigned int)slave); - printk(KERN_CRIT "bond_sethwaddr: slave->addr_len=%d\n", slave->addr_len); -#endif - memcpy(master->dev_addr, slave->dev_addr, slave->addr_len); - return 0; -} - -static int bond_info_query(struct net_device *master, struct ifbond *info) -{ - bonding_t *bond = (struct bonding *) master->priv; - slave_t *slave; - unsigned long flags; - - info->bond_mode = bond_mode; - info->num_slaves = 0; - info->miimon = miimon; - - read_lock_irqsave(&bond->lock, flags); - for (slave = bond->prev; slave != (slave_t *)bond; slave = slave->prev) { - info->num_slaves++; - } - read_unlock_irqrestore(&bond->lock, flags); - - return 0; -} - -static int bond_slave_info_query(struct net_device *master, - struct ifslave *info) -{ - bonding_t *bond = (struct bonding *) master->priv; - slave_t *slave; - int cur_ndx = 0; - unsigned long flags; - - if (info->slave_id < 0) { 
- return -ENODEV; - } - - read_lock_irqsave(&bond->lock, flags); - for (slave = bond->prev; - slave != (slave_t *)bond && cur_ndx < info->slave_id; - slave = slave->prev) { - cur_ndx++; - } - read_unlock_irqrestore(&bond->lock, flags); - - if (slave != (slave_t *)bond) { - strcpy(info->slave_name, slave->dev->name); - info->link = slave->link; - info->state = slave->state; - info->link_failure_count = slave->link_failure_count; - } else { - return -ENODEV; - } - - return 0; -} - -static int bond_ioctl(struct net_device *master_dev, struct ifreq *ifr, int cmd) -{ - struct net_device *slave_dev = NULL; - struct ifbond *u_binfo = NULL, k_binfo; - struct ifslave *u_sinfo = NULL, k_sinfo; - struct mii_ioctl_data *mii = NULL; - int ret = 0; - -#ifdef BONDING_DEBUG - printk(KERN_INFO "bond_ioctl: master=%s, cmd=%d\n", - master_dev->name, cmd); -#endif - - switch (cmd) { - case SIOCGMIIPHY: - mii = (struct mii_ioctl_data *)&ifr->ifr_data; - if (mii == NULL) { - return -EINVAL; - } - mii->phy_id = 0; - /* Fall Through */ - case SIOCGMIIREG: - /* - * We do this again just in case we were called by SIOCGMIIREG - * instead of SIOCGMIIPHY. 
- */ - mii = (struct mii_ioctl_data *)&ifr->ifr_data; - if (mii == NULL) { - return -EINVAL; - } - if (mii->reg_num == 1) { - mii->val_out = bond_check_mii_link( - (struct bonding *)master_dev->priv); - } - return 0; - case BOND_INFO_QUERY_OLD: - case SIOCBONDINFOQUERY: - u_binfo = (struct ifbond *)ifr->ifr_data; - if (copy_from_user(&k_binfo, u_binfo, sizeof(ifbond))) { - return -EFAULT; - } - ret = bond_info_query(master_dev, &k_binfo); - if (ret == 0) { - if (copy_to_user(u_binfo, &k_binfo, sizeof(ifbond))) { - return -EFAULT; - } - } - return ret; - case BOND_SLAVE_INFO_QUERY_OLD: - case SIOCBONDSLAVEINFOQUERY: - u_sinfo = (struct ifslave *)ifr->ifr_data; - if (copy_from_user(&k_sinfo, u_sinfo, sizeof(ifslave))) { - return -EFAULT; - } - ret = bond_slave_info_query(master_dev, &k_sinfo); - if (ret == 0) { - if (copy_to_user(u_sinfo, &k_sinfo, sizeof(ifslave))) { - return -EFAULT; - } - } - return ret; - } - - if (!capable(CAP_NET_ADMIN)) { - return -EPERM; - } - - slave_dev = dev_get_by_name(ifr->ifr_slave); - -#ifdef BONDING_DEBUG - printk(KERN_INFO "slave_dev=%x: \n", (unsigned int)slave_dev); - printk(KERN_INFO "slave_dev->name=%s: \n", slave_dev->name); -#endif - - if (slave_dev == NULL) { - ret = -ENODEV; - } else { - switch (cmd) { - case BOND_ENSLAVE_OLD: - case SIOCBONDENSLAVE: - ret = bond_enslave(master_dev, slave_dev); - break; - case BOND_RELEASE_OLD: - case SIOCBONDRELEASE: - ret = bond_release(master_dev, slave_dev); - break; - case BOND_SETHWADDR_OLD: - case SIOCBONDSETHWADDR: - ret = bond_sethwaddr(master_dev, slave_dev); - break; - case BOND_CHANGE_ACTIVE_OLD: - case SIOCBONDCHANGEACTIVE: - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - ret = bond_change_active(master_dev, slave_dev); - } - else { - ret = -EINVAL; - } - break; - default: - ret = -EOPNOTSUPP; - } - dev_put(slave_dev); - } - return ret; -} - -#ifdef CONFIG_NET_FASTROUTE -static int bond_accept_fastpath(struct net_device *dev, struct dst_entry *dst) -{ - return -1; -} -#endif - -/* 
- * in broadcast mode, we send everything to all usable interfaces. - */ -static int bond_xmit_broadcast(struct sk_buff *skb, struct net_device *dev) -{ - slave_t *slave, *start_at; - struct bonding *bond = (struct bonding *) dev->priv; - unsigned long flags; - struct net_device *device_we_should_send_to = 0; - - if (!IS_UP(dev)) { /* bond down */ - dev_kfree_skb(skb); - return 0; - } - - read_lock_irqsave(&bond->lock, flags); - - read_lock(&bond->ptrlock); - slave = start_at = bond->current_slave; - read_unlock(&bond->ptrlock); - - if (slave == NULL) { /* we're at the root, get the first slave */ - /* no suitable interface, frame not sent */ - read_unlock_irqrestore(&bond->lock, flags); - dev_kfree_skb(skb); - return 0; - } - - do { - if (IS_UP(slave->dev) - && (slave->link == BOND_LINK_UP) - && (slave->state == BOND_STATE_ACTIVE)) { - if (device_we_should_send_to) { - struct sk_buff *skb2; - if ((skb2 = skb_clone(skb, GFP_ATOMIC)) == NULL) { - printk(KERN_ERR "bond_xmit_broadcast: skb_clone() failed\n"); - continue; - } - - skb2->dev = device_we_should_send_to; - skb2->priority = 1; - dev_queue_xmit(skb2); - } - device_we_should_send_to = slave->dev; - } - } while ((slave = slave->next) != start_at); - - if (device_we_should_send_to) { - skb->dev = device_we_should_send_to; - skb->priority = 1; - dev_queue_xmit(skb); - } else - dev_kfree_skb(skb); - - /* frame sent to all suitable interfaces */ - read_unlock_irqrestore(&bond->lock, flags); - return 0; -} - -static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *dev) -{ - slave_t *slave, *start_at; - struct bonding *bond = (struct bonding *) dev->priv; - unsigned long flags; - - if (!IS_UP(dev)) { /* bond down */ - dev_kfree_skb(skb); - return 0; - } - - read_lock_irqsave(&bond->lock, flags); - - read_lock(&bond->ptrlock); - slave = start_at = bond->current_slave; - read_unlock(&bond->ptrlock); - - if (slave == NULL) { /* we're at the root, get the first slave */ - /* no suitable interface, frame 
not sent */ - dev_kfree_skb(skb); - read_unlock_irqrestore(&bond->lock, flags); - return 0; - } - - do { - if (IS_UP(slave->dev) - && (slave->link == BOND_LINK_UP) - && (slave->state == BOND_STATE_ACTIVE)) { - - skb->dev = slave->dev; - skb->priority = 1; - dev_queue_xmit(skb); - - write_lock(&bond->ptrlock); - bond->current_slave = slave->next; - write_unlock(&bond->ptrlock); - - read_unlock_irqrestore(&bond->lock, flags); - return 0; - } - } while ((slave = slave->next) != start_at); - - /* no suitable interface, frame not sent */ - dev_kfree_skb(skb); - read_unlock_irqrestore(&bond->lock, flags); - return 0; -} - -/* - * in XOR mode, we determine the output device by performing xor on - * the source and destination hw adresses. If this device is not - * enabled, find the next slave following this xor slave. - */ -static int bond_xmit_xor(struct sk_buff *skb, struct net_device *dev) -{ - slave_t *slave, *start_at; - struct bonding *bond = (struct bonding *) dev->priv; - unsigned long flags; - struct ethhdr *data = (struct ethhdr *)skb->data; - int slave_no; - - if (!IS_UP(dev)) { /* bond down */ - dev_kfree_skb(skb); - return 0; - } - - read_lock_irqsave(&bond->lock, flags); - slave = bond->prev; - - /* we're at the root, get the first slave */ - if (bond->slave_cnt == 0) { - /* no suitable interface, frame not sent */ - dev_kfree_skb(skb); - read_unlock_irqrestore(&bond->lock, flags); - return 0; - } - - slave_no = (data->h_dest[5]^slave->dev->dev_addr[5]) % bond->slave_cnt; - - while ( (slave_no > 0) && (slave != (slave_t *)bond) ) { - slave = slave->prev; - slave_no--; - } - start_at = slave; - - do { - if (IS_UP(slave->dev) - && (slave->link == BOND_LINK_UP) - && (slave->state == BOND_STATE_ACTIVE)) { - - skb->dev = slave->dev; - skb->priority = 1; - dev_queue_xmit(skb); - - read_unlock_irqrestore(&bond->lock, flags); - return 0; - } - } while ((slave = slave->next) != start_at); - - /* no suitable interface, frame not sent */ - dev_kfree_skb(skb); - 
read_unlock_irqrestore(&bond->lock, flags); - return 0; -} - -/* - * in active-backup mode, we know that bond->current_slave is always valid if - * the bond has a usable interface. - */ -static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *dev) -{ - struct bonding *bond = (struct bonding *) dev->priv; - unsigned long flags; - int ret; - - if (!IS_UP(dev)) { /* bond down */ - dev_kfree_skb(skb); - return 0; - } - - /* if we are sending arp packets, try to at least - identify our own ip address */ - if ( (arp_interval > 0) && (my_ip == 0) && - (skb->protocol == __constant_htons(ETH_P_ARP) ) ) { - char *the_ip = (((char *)skb->data)) - + sizeof(struct ethhdr) - + sizeof(struct arphdr) + - ETH_ALEN; - memcpy(&my_ip, the_ip, 4); - } - - /* if we are sending arp packets and don't know - * the target hw address, save it so we don't need - * to use a broadcast address. - * don't do this if in active backup mode because the slaves must - * receive packets to stay up, and the only ones they receive are - * broadcasts. 
- */ - if ( (bond_mode != BOND_MODE_ACTIVEBACKUP) && - (arp_ip_count == 1) && - (arp_interval > 0) && (arp_target_hw_addr == NULL) && - (skb->protocol == __constant_htons(ETH_P_IP) ) ) { - struct ethhdr *eth_hdr = - (struct ethhdr *) (((char *)skb->data)); - struct iphdr *ip_hdr = (struct iphdr *)(eth_hdr + 1); - - if (arp_target[0] == ip_hdr->daddr) { - arp_target_hw_addr = kmalloc(ETH_ALEN, GFP_KERNEL); - if (arp_target_hw_addr != NULL) - memcpy(arp_target_hw_addr, eth_hdr->h_dest, ETH_ALEN); - } - } - - read_lock_irqsave(&bond->lock, flags); - - read_lock(&bond->ptrlock); - if (bond->current_slave != NULL) { /* one usable interface */ - skb->dev = bond->current_slave->dev; - read_unlock(&bond->ptrlock); - skb->priority = 1; - ret = dev_queue_xmit(skb); - read_unlock_irqrestore(&bond->lock, flags); - return 0; - } - else { - read_unlock(&bond->ptrlock); - } - - /* no suitable interface, frame not sent */ -#ifdef BONDING_DEBUG - printk(KERN_INFO "There was no suitable interface, so we don't transmit\n"); -#endif - dev_kfree_skb(skb); - read_unlock_irqrestore(&bond->lock, flags); - return 0; -} - -static struct net_device_stats *bond_get_stats(struct net_device *dev) -{ - bonding_t *bond = dev->priv; - struct net_device_stats *stats = bond->stats, *sstats; - slave_t *slave; - unsigned long flags; - - memset(bond->stats, 0, sizeof(struct net_device_stats)); - - read_lock_irqsave(&bond->lock, flags); - - for (slave = bond->prev; slave != (slave_t *)bond; slave = slave->prev) { - sstats = slave->dev->get_stats(slave->dev); - - stats->rx_packets += sstats->rx_packets; - stats->rx_bytes += sstats->rx_bytes; - stats->rx_errors += sstats->rx_errors; - stats->rx_dropped += sstats->rx_dropped; - - stats->tx_packets += sstats->tx_packets; - stats->tx_bytes += sstats->tx_bytes; - stats->tx_errors += sstats->tx_errors; - stats->tx_dropped += sstats->tx_dropped; - - stats->multicast += sstats->multicast; - stats->collisions += sstats->collisions; - - stats->rx_length_errors += 
sstats->rx_length_errors; - stats->rx_over_errors += sstats->rx_over_errors; - stats->rx_crc_errors += sstats->rx_crc_errors; - stats->rx_frame_errors += sstats->rx_frame_errors; - stats->rx_fifo_errors += sstats->rx_fifo_errors; - stats->rx_missed_errors += sstats->rx_missed_errors; - - stats->tx_aborted_errors += sstats->tx_aborted_errors; - stats->tx_carrier_errors += sstats->tx_carrier_errors; - stats->tx_fifo_errors += sstats->tx_fifo_errors; - stats->tx_heartbeat_errors += sstats->tx_heartbeat_errors; - stats->tx_window_errors += sstats->tx_window_errors; - - } - - read_unlock_irqrestore(&bond->lock, flags); - return stats; -} - -static int bond_get_info(char *buf, char **start, off_t offset, int length) -{ - bonding_t *bond = these_bonds; - int len = 0; - off_t begin = 0; - u16 link; - slave_t *slave = NULL; - unsigned long flags; - - while (bond != NULL) { - /* - * This function locks the mutex, so we can't lock it until - * afterwards - */ - link = bond_check_mii_link(bond); - - len += sprintf(buf + len, "Bonding Mode: %s\n", - bond_mode_name()); - - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { - read_lock_irqsave(&bond->lock, flags); - read_lock(&bond->ptrlock); - if (bond->current_slave != NULL) { - len += sprintf(buf + len, - "Currently Active Slave: %s\n", - bond->current_slave->dev->name); - } - read_unlock(&bond->ptrlock); - read_unlock_irqrestore(&bond->lock, flags); - } - - len += sprintf(buf + len, "MII Status: "); - len += sprintf(buf + len, - link == BMSR_LSTATUS ? 
"up\n" : "down\n"); - len += sprintf(buf + len, "MII Polling Interval (ms): %d\n", - miimon); - len += sprintf(buf + len, "Up Delay (ms): %d\n", - updelay * miimon); - len += sprintf(buf + len, "Down Delay (ms): %d\n", - downdelay * miimon); - len += sprintf(buf + len, "Multicast Mode: %s\n", - multicast_mode_name()); - - read_lock_irqsave(&bond->lock, flags); - for (slave = bond->prev; slave != (slave_t *)bond; - slave = slave->prev) { - len += sprintf(buf + len, "\nSlave Interface: %s\n", slave->dev->name); - - len += sprintf(buf + len, "MII Status: "); - - len += sprintf(buf + len, - slave->link == BOND_LINK_UP ? - "up\n" : "down\n"); - len += sprintf(buf + len, "Link Failure Count: %d\n", - slave->link_failure_count); - - len += sprintf(buf + len, - "Permanent HW addr: %02x:%02x:%02x:%02x:%02x:%02x\n", - slave->perm_hwaddr[0], - slave->perm_hwaddr[1], - slave->perm_hwaddr[2], - slave->perm_hwaddr[3], - slave->perm_hwaddr[4], - slave->perm_hwaddr[5]); - } - read_unlock_irqrestore(&bond->lock, flags); - - /* - * Figure out the calcs for the /proc/net interface - */ - *start = buf + (offset - begin); - len -= (offset - begin); - if (len > length) { - len = length; - } - if (len < 0) { - len = 0; - } - - - bond = bond->next_bond; - } - return len; -} - -static int bond_event(struct notifier_block *this, unsigned long event, - void *ptr) -{ - struct bonding *this_bond = (struct bonding *)these_bonds; - struct bonding *last_bond; - struct net_device *event_dev = (struct net_device *)ptr; - - /* while there are bonds configured */ - while (this_bond != NULL) { - if (this_bond == event_dev->priv ) { - switch (event) { - case NETDEV_UNREGISTER: - /* - * remove this bond from a linked list of - * bonds - */ - if (this_bond == these_bonds) { - these_bonds = this_bond->next_bond; - } else { - for (last_bond = these_bonds; - last_bond != NULL; - last_bond = last_bond->next_bond) { - if (last_bond->next_bond == - this_bond) { - last_bond->next_bond = - this_bond->next_bond; 
- } - } - } - return NOTIFY_DONE; - - default: - return NOTIFY_DONE; - } - } else if (this_bond->device == event_dev->master) { - switch (event) { - case NETDEV_UNREGISTER: - bond_release(this_bond->device, event_dev); - break; - } - return NOTIFY_DONE; - } - this_bond = this_bond->next_bond; - } - return NOTIFY_DONE; -} - -static struct notifier_block bond_netdev_notifier = { - notifier_call: bond_event, -}; - -static int __init bond_init(struct net_device *dev) -{ - bonding_t *bond, *this_bond, *last_bond; - int count; - -#ifdef BONDING_DEBUG - printk (KERN_INFO "Begin bond_init for %s\n", dev->name); -#endif - bond = kmalloc(sizeof(struct bonding), GFP_KERNEL); - if (bond == NULL) { - return -ENOMEM; - } - memset(bond, 0, sizeof(struct bonding)); - - /* initialize rwlocks */ - rwlock_init(&bond->lock); - rwlock_init(&bond->ptrlock); - - bond->stats = kmalloc(sizeof(struct net_device_stats), GFP_KERNEL); - if (bond->stats == NULL) { - kfree(bond); - return -ENOMEM; - } - memset(bond->stats, 0, sizeof(struct net_device_stats)); - - bond->next = bond->prev = (slave_t *)bond; - bond->current_slave = NULL; - bond->current_arp_slave = NULL; - bond->device = dev; - dev->priv = bond; - - /* Initialize the device structure. */ - switch (bond_mode) { - case BOND_MODE_ACTIVEBACKUP: - dev->hard_start_xmit = bond_xmit_activebackup; - break; - case BOND_MODE_ROUNDROBIN: - dev->hard_start_xmit = bond_xmit_roundrobin; - break; - case BOND_MODE_XOR: - dev->hard_start_xmit = bond_xmit_xor; - break; - case BOND_MODE_BROADCAST: - dev->hard_start_xmit = bond_xmit_broadcast; - break; - default: - printk(KERN_ERR "Unknown bonding mode %d\n", bond_mode); - kfree(bond->stats); - kfree(bond); - return -EINVAL; - } - - dev->get_stats = bond_get_stats; - dev->open = bond_open; - dev->stop = bond_close; - dev->set_multicast_list = set_multicast_list; - dev->do_ioctl = bond_ioctl; - - /* - * Fill in the fields of the device structure with ethernet-generic - * values. 
- */ - - ether_setup(dev); - - dev->tx_queue_len = 0; - dev->flags |= IFF_MASTER|IFF_MULTICAST; -#ifdef CONFIG_NET_FASTROUTE - dev->accept_fastpath = bond_accept_fastpath; -#endif - - printk(KERN_INFO "%s registered with", dev->name); - if (miimon > 0) { - printk(" MII link monitoring set to %d ms", miimon); - updelay /= miimon; - downdelay /= miimon; - } else { - printk("out MII link monitoring"); - } - printk(", in %s mode.\n", bond_mode_name()); - - printk(KERN_INFO "%s registered with", dev->name); - if (arp_interval > 0) { - printk(" ARP monitoring set to %d ms with %d target(s):", - arp_interval, arp_ip_count); - for (count=0 ; countbond_proc_dir = proc_mkdir(dev->name, proc_net); - if (bond->bond_proc_dir == NULL) { - printk(KERN_ERR "%s: Cannot init /proc/net/%s/\n", - dev->name, dev->name); - kfree(bond->stats); - kfree(bond); - return -ENOMEM; - } - bond->bond_proc_info_file = - create_proc_info_entry("info", 0, bond->bond_proc_dir, - bond_get_info); - if (bond->bond_proc_info_file == NULL) { - printk(KERN_ERR "%s: Cannot init /proc/net/%s/info\n", - dev->name, dev->name); - remove_proc_entry(dev->name, proc_net); - kfree(bond->stats); - kfree(bond); - return -ENOMEM; - } -#endif /* CONFIG_PROC_FS */ - - if (first_pass == 1) { - these_bonds = bond; - register_netdevice_notifier(&bond_netdev_notifier); - first_pass = 0; - } else { - last_bond = these_bonds; - this_bond = these_bonds->next_bond; - while (this_bond != NULL) { - last_bond = this_bond; - this_bond = this_bond->next_bond; - } - last_bond->next_bond = bond; - } - - return 0; -} - -/* -static int __init bond_probe(struct net_device *dev) -{ - bond_init(dev); - return 0; -} - */ - -/* - * Convert string input module parms. Accept either the - * number of the mode or its string name. 
- */ -static inline int -bond_parse_parm(char *mode_arg, struct bond_parm_tbl *tbl) -{ - int i; - - for (i = 0; tbl[i].modename != NULL; i++) { - if ((isdigit(*mode_arg) && - tbl[i].mode == simple_strtol(mode_arg, NULL, 0)) || - (0 == strncmp(mode_arg, tbl[i].modename, - strlen(tbl[i].modename)))) { - return tbl[i].mode; - } - } - - return -1; -} - - -static int __init bonding_init(void) -{ - int no; - int err; - - /* Find a name for this unit */ - static struct net_device *dev_bond = NULL; - - printk(KERN_INFO "%s", version); - - /* - * Convert string parameters. - */ - if (mode) { - bond_mode = bond_parse_parm(mode, bond_mode_tbl); - if (bond_mode == -1) { - printk(KERN_WARNING - "bonding_init(): Invalid bonding mode \"%s\"\n", - mode == NULL ? "NULL" : mode); - return -EINVAL; - } - } - - if (multicast) { - multicast_mode = bond_parse_parm(multicast, bond_mc_tbl); - if (multicast_mode == -1) { - printk(KERN_WARNING - "bonding_init(): Invalid multicast mode \"%s\"\n", - multicast == NULL ? 
"NULL" : multicast); - return -EINVAL; - } - } - - if (max_bonds < 1 || max_bonds > INT_MAX) { - printk(KERN_WARNING - "bonding_init(): max_bonds (%d) not in range %d-%d, " - "so it was reset to BOND_DEFAULT_MAX_BONDS (%d)", - max_bonds, 1, INT_MAX, BOND_DEFAULT_MAX_BONDS); - max_bonds = BOND_DEFAULT_MAX_BONDS; - } - dev_bond = dev_bonds = kmalloc(max_bonds*sizeof(struct net_device), - GFP_KERNEL); - if (dev_bond == NULL) { - return -ENOMEM; - } - memset(dev_bonds, 0, max_bonds*sizeof(struct net_device)); - - if (miimon < 0) { - printk(KERN_WARNING - "bonding_init(): miimon module parameter (%d), " - "not in range 0-%d, so it was reset to %d\n", - miimon, INT_MAX, BOND_LINK_MON_INTERV); - miimon = BOND_LINK_MON_INTERV; - } - - if (updelay < 0) { - printk(KERN_WARNING - "bonding_init(): updelay module parameter (%d), " - "not in range 0-%d, so it was reset to 0\n", - updelay, INT_MAX); - updelay = 0; - } - - if (downdelay < 0) { - printk(KERN_WARNING - "bonding_init(): downdelay module parameter (%d), " - "not in range 0-%d, so it was reset to 0\n", - downdelay, INT_MAX); - downdelay = 0; - } - - if (miimon == 0) { - if ((updelay != 0) || (downdelay != 0)) { - /* just warn the user the up/down delay will have - * no effect since miimon is zero... 
- */ - printk(KERN_WARNING - "bonding_init(): miimon module parameter not " - "set and updelay (%d) or downdelay (%d) module " - "parameter is set; updelay and downdelay have " - "no effect unless miimon is set\n", - updelay, downdelay); - } - } else { - /* don't allow arp monitoring */ - if (arp_interval != 0) { - printk(KERN_WARNING - "bonding_init(): miimon (%d) and arp_interval " - "(%d) can't be used simultaneously, " - "disabling ARP monitoring\n", - miimon, arp_interval); - arp_interval = 0; - } - - if ((updelay % miimon) != 0) { - /* updelay will be rounded in bond_init() when it - * is divided by miimon, we just inform user here - */ - printk(KERN_WARNING - "bonding_init(): updelay (%d) is not a multiple " - "of miimon (%d), updelay rounded to %d ms\n", - updelay, miimon, (updelay / miimon) * miimon); - } - - if ((downdelay % miimon) != 0) { - /* downdelay will be rounded in bond_init() when it - * is divided by miimon, we just inform user here - */ - printk(KERN_WARNING - "bonding_init(): downdelay (%d) is not a " - "multiple of miimon (%d), downdelay rounded " - "to %d ms\n", - downdelay, miimon, - (downdelay / miimon) * miimon); - } - } - - if (arp_interval < 0) { - printk(KERN_WARNING - "bonding_init(): arp_interval module parameter (%d), " - "not in range 0-%d, so it was reset to %d\n", - arp_interval, INT_MAX, BOND_LINK_ARP_INTERV); - arp_interval = BOND_LINK_ARP_INTERV; - } - - for (arp_ip_count=0 ; - (arp_ip_count < MAX_ARP_IP_TARGETS) && arp_ip_target[arp_ip_count]; - arp_ip_count++ ) { - /* TODO: check and log bad ip address */ - if (my_inet_aton(arp_ip_target[arp_ip_count], - &arp_target[arp_ip_count]) == 0) { - printk(KERN_WARNING - "bonding_init(): bad arp_ip_target module " - "parameter (%s), ARP monitoring will not be " - "performed\n", - arp_ip_target[arp_ip_count]); - arp_interval = 0; - } - } - - - if ( (arp_interval > 0) && (arp_ip_count==0)) { - /* don't allow arping if no arp_ip_target given... 
*/ - printk(KERN_WARNING - "bonding_init(): arp_interval module parameter " - "(%d) specified without providing an arp_ip_target " - "parameter, arp_interval was reset to 0\n", - arp_interval); - arp_interval = 0; - } - - if ((miimon == 0) && (arp_interval == 0)) { - /* miimon and arp_interval not set, we need one so things - * work as expected, see bonding.txt for details - */ - printk(KERN_ERR - "bonding_init(): either miimon or " - "arp_interval and arp_ip_target module parameters " - "must be specified, otherwise bonding will not detect " - "link failures! see bonding.txt for details.\n"); - } - - if ((primary != NULL) && (bond_mode != BOND_MODE_ACTIVEBACKUP)){ - /* currently, using a primary only makes sence - * in active backup mode - */ - printk(KERN_WARNING - "bonding_init(): %s primary device specified but has " - " no effect in %s mode\n", - primary, bond_mode_name()); - primary = NULL; - } - - - for (no = 0; no < max_bonds; no++) { - dev_bond->init = bond_init; - - err = dev_alloc_name(dev_bond,"bond%d"); - if (err < 0) { - kfree(dev_bonds); - return err; - } - SET_MODULE_OWNER(dev_bond); - if (register_netdev(dev_bond) != 0) { - kfree(dev_bonds); - return -EIO; - } - dev_bond++; - } - return 0; -} - -static void __exit bonding_exit(void) -{ - struct net_device *dev_bond = dev_bonds; - struct bonding *bond; - int no; - - unregister_netdevice_notifier(&bond_netdev_notifier); - - for (no = 0; no < max_bonds; no++) { - -#ifdef CONFIG_PROC_FS - bond = (struct bonding *) dev_bond->priv; - remove_proc_entry("info", bond->bond_proc_dir); - remove_proc_entry(dev_bond->name, proc_net); -#endif - unregister_netdev(dev_bond); - kfree(bond->stats); - kfree(dev_bond->priv); - - dev_bond->priv = NULL; - dev_bond++; - } - kfree(dev_bonds); -} - -module_init(bonding_init); -module_exit(bonding_exit); -MODULE_LICENSE("GPL"); -MODULE_DESCRIPTION(DRV_DESCRIPTION ", v" DRV_VERSION); - -/* - * Local variables: - * c-indent-level: 8 - * c-basic-offset: 8 - * tab-width: 8 - * 
End: - */ diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/Makefile linux-2.4.20-bonding-20030317-devel/drivers/net/Makefile --- linux-2.4.20-bonding-20030317/drivers/net/Makefile 2003-03-18 17:03:29.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/Makefile 2003-03-18 17:03:29.000000000 +0200 @@ -29,6 +29,10 @@ ifeq ($(CONFIG_E1000),y) obj-y += e1000/e1000.o endif +ifeq ($(CONFIG_BONDING),y) + obj-y += bonding/bonding.o +endif + ifeq ($(CONFIG_ISDN_PPP),y) obj-$(CONFIG_ISDN) += slhc.o endif @@ -46,6 +50,7 @@ subdir-$(CONFIG_SK98LIN) += sk98lin subdir-$(CONFIG_SKFP) += skfp subdir-$(CONFIG_E100) += e100 subdir-$(CONFIG_E1000) += e1000 +subdir-$(CONFIG_BONDING) += bonding # # link order important here @@ -157,7 +162,6 @@ endif obj-$(CONFIG_STRIP) += strip.o obj-$(CONFIG_DUMMY) += dummy.o -obj-$(CONFIG_BONDING) += bonding.o obj-$(CONFIG_DE600) += de600.o obj-$(CONFIG_DE620) += de620.o obj-$(CONFIG_AT1500) += lance.o -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. 
| | | | Anti-Spam: shmulik dot hen at intel dot com | From hshmulik@intel.com Thu Mar 20 07:18:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:19:06 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFIeq9026739 for ; Thu, 20 Mar 2003 07:18:40 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KFBt608010 for ; Thu, 20 Mar 2003 15:11:59 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFJsc09831 for ; Thu, 20 Mar 2003 15:19:54 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007183911683 ; Thu, 20 Mar 2003 07:18:41 -0800 Date: Thu, 20 Mar 2003 17:18:23 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [patch] (8/8) Add 802.3ad support to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by oss.sgi.com id h2KFIeq9026739 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1996 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 122965 Lines: 3413 This patch adds the actual code that does the IEEE 802.3ad dynamic link aggregation stuff. 
This mode offers the following advantages: automatic configuration, rapid configuration and reconfiguration, and deterministic behavior. It forms aggregation groups that include only members with full duplex and the same speed, and all adapters in the active aggregator simultaneously receive and transmit data. This patch is against bonding 2.4.20-20030317. diff -Nuarp linux-2.4.20-bonding-20030317/Documentation/networking/bonding.txt linux-2.4.20-bonding-20030317-devel/Documentation/networking/bonding.txt --- linux-2.4.20-bonding-20030317/Documentation/networking/bonding.txt 2003-03-18 17:24:24.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/Documentation/networking/bonding.txt 2003-03-18 17:24:25.000000000 +0200 @@ -237,6 +237,11 @@ text or numeric option): Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance. + 802.3ad or 4 + IEEE 802.3ad Dynamic link aggregation. Creates aggregation + groups that share the same speed and duplex settings. + Transmits and receives on all slaves in the active aggregator. + miimon Specifies the frequency in milli-seconds that MII link monitoring will @@ -412,7 +417,7 @@ Switch Configuration While the switch does not need to be configured when the active-backup policy is used (mode=1), it does need to be configured for the round-robin, -XOR, and broadcast policies (mode=0, mode=2, and mode=3). +XOR, broadcast, and 802.3ad policies (mode=0, mode=2, mode=3, and mode=4). Verifying Bond Configuration @@ -445,7 +450,7 @@ parameters of mode=0 and miimon=1000 is The network configuration can be verified using the ifconfig command. In the example below, the bond0 interface is the master (MASTER) while eth0 and eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address -(HWaddr) as bond0. +(HWaddr) as bond0 (except for 802.3ad mode). [root]# /sbin/ifconfig bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 @@ -538,6 +543,13 @@ Frequently Asked Questions units. 
* Linux bonding, of course ! + In 802.3ad mode, it works with systems that support IEEE 802.3ad + Dynamic Link Aggregation: + + * Extreme Networks Summit 7i (look for link-aggregation). + * Cisco 6500 series (look for lacp). + * Foundry Big Iron 4000 + In active-backup mode, it should work with any Layer-II switche. @@ -591,6 +603,9 @@ Frequently Asked Questions Broadcast policy transmits everything on all slave interfaces. + 802.3ad, based on XOR but distributes traffic among all interfaces + in the active aggregator. + High Availability ================= diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_3ad.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_3ad.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_3ad.c 1970-01-01 02:00:00.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_3ad.c 2003-03-18 17:24:25.000000000 +0200 @@ -0,0 +1,2454 @@ +/**************************************************************************** + Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along with + this program; if not, write to the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston, MA 02111-1307, USA. + + The full GNU General Public License is included in this distribution in the + file called LICENSE.
+*****************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include "bonding.h" +#include "bond_3ad.h" + +// General definitions +#define AD_SHORT_TIMEOUT 1 +#define AD_LONG_TIMEOUT 0 +#define AD_STANDBY 0x2 +#define AD_MAX_TX_IN_SECOND 3 +#define AD_COLLECTOR_MAX_DELAY 0 + +// Timer definitions(43.4.4 in the 802.3ad standard) +#define AD_FAST_PERIODIC_TIME 1 +#define AD_SLOW_PERIODIC_TIME 30 +#define AD_SHORT_TIMEOUT_TIME (3*AD_FAST_PERIODIC_TIME) +#define AD_LONG_TIMEOUT_TIME (3*AD_SLOW_PERIODIC_TIME) +#define AD_CHURN_DETECTION_TIME 60 +#define AD_AGGREGATE_WAIT_TIME 2 + +// Port state definitions(43.4.2.2 in the 802.3ad standard) +#define AD_STATE_LACP_ACTIVITY 0x1 +#define AD_STATE_LACP_TIMEOUT 0x2 +#define AD_STATE_AGGREGATION 0x4 +#define AD_STATE_SYNCHRONIZATION 0x8 +#define AD_STATE_COLLECTING 0x10 +#define AD_STATE_DISTRIBUTING 0x20 +#define AD_STATE_DEFAULTED 0x40 +#define AD_STATE_EXPIRED 0x80 + +// Port Variables definitions used by the State Machines(43.4.7 in the 802.3ad standard) +#define AD_PORT_BEGIN 0x1 +#define AD_PORT_LACP_ENABLED 0x2 +#define AD_PORT_ACTOR_CHURN 0x4 +#define AD_PORT_PARTNER_CHURN 0x8 +#define AD_PORT_READY 0x10 +#define AD_PORT_READY_N 0x20 +#define AD_PORT_MATCHED 0x40 +#define AD_PORT_STANDBY 0x80 +#define AD_PORT_SELECTED 0x100 +#define AD_PORT_MOVED 0x200 + +// Port Key definitions +// key is determined according to the link speed, duplex and +// user key(which is yet not supported) +// ------------------------------------------------------------ +// Port key : | User key | Speed |Duplex| +// ------------------------------------------------------------ +// 16 6 1 0 +#define AD_DUPLEX_KEY_BITS 0x1 +#define AD_SPEED_KEY_BITS 0x3E +#define AD_USER_KEY_BITS 0xFFC0 + +//dalloun +#define AD_LINK_SPEED_BITMASK_1MBPS 0x1 +#define AD_LINK_SPEED_BITMASK_10MBPS 0x2 +#define AD_LINK_SPEED_BITMASK_100MBPS 0x4 +#define AD_LINK_SPEED_BITMASK_1000MBPS 0x8 
+//endalloun + +// compare MAC addresses +#define MAC_ADDRESS_COMPARE(A, B) memcmp(A, B, ETH_ALEN) + +static struct mac_addr null_mac_addr = {{0, 0, 0, 0, 0, 0}}; +static u16 ad_ticks_per_sec; + +// ================= 3AD api to bonding and kernel code ================== +static u16 __get_link_speed(struct port *port); +static u8 __get_duplex(struct port *port); +static inline void __initialize_port_locks(struct port *port); +static inline void __deinitialize_port_locks(struct port *port); +//conversions +static void __ntohs_lacpdu(struct lacpdu *lacpdu); +static u16 __ad_timer_to_ticks(u16 timer_type, u16 Par); + + +// ================= ad code helper functions ================== +//needed by ad_rx_machine(...) +static void __record_pdu(struct lacpdu *lacpdu, struct port *port); +static void __record_default(struct port *port); +static void __update_selected(struct lacpdu *lacpdu, struct port *port); +static void __update_default_selected(struct port *port); +static void __choose_matched(struct lacpdu *lacpdu, struct port *port); +static void __update_ntt(struct lacpdu *lacpdu, struct port *port); + +//needed for ad_mux_machine(..) +static void __attach_bond_to_agg(struct port *port); +static void __detach_bond_from_agg(struct port *port); +static int __agg_ports_are_ready(struct aggregator *aggregator); +static void __set_agg_ports_ready(struct aggregator *aggregator, int val); + +//needed for ad_agg_selection_logic(...) 
+static u32 __get_agg_bandwidth(struct aggregator *aggregator); +static struct aggregator *__get_active_agg(struct aggregator *aggregator); + + +// ================= main 802.3ad protocol functions ================== +static int ad_lacpdu_send(struct port *port); +static int ad_marker_send(struct port *port, struct marker *marker); +static void ad_mux_machine(struct port *port); +static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port); +static void ad_tx_machine(struct port *port); +static void ad_periodic_machine(struct port *port); +static void ad_port_selection_logic(struct port *port); +static void ad_agg_selection_logic(struct aggregator *aggregator); +static void ad_clear_agg(struct aggregator *aggregator); +static void ad_initialize_agg(struct aggregator *aggregator); +static void ad_initialize_port(struct port *port); +static void ad_initialize_lacpdu(struct lacpdu *Lacpdu); +static void ad_enable_collecting_distributing(struct port *port); +static void ad_disable_collecting_distributing(struct port *port); +static void ad_marker_info_received(struct marker *marker_info, struct port *port); +static void ad_marker_response_received(struct marker *marker, struct port *port); + + +///////////////////////////////////////////////////////////////////////////////// +// ================= api to bonding and kernel code ================== +///////////////////////////////////////////////////////////////////////////////// + +/** + * __get_bond_by_port - get the port's bonding struct + * @port: the port we're looking at + * + * Return @port's bonding struct, or %NULL if it can't be found. + */ +static inline struct bonding *__get_bond_by_port(struct port *port) +{ + if (port->slave == NULL) { + return NULL; + } + + return bond_get_bond_by_slave(port->slave); +} + +/** + * __get_first_port - get the first port in the bond + * @bond: the bond we're looking at + * + * Return the port of the first slave in @bond, or %NULL if it can't be found. 
+ */ +static inline struct port *__get_first_port(struct bonding *bond) +{ + struct slave *slave = bond->next; + + if (slave == (struct slave *)bond) { + return NULL; + } + + return &(SLAVE_AD_INFO(slave).port); +} + +/** + * __get_next_port - get the next port in the bond + * @port: the port we're looking at + * + * Return the port of the slave that is next in line of @port's slave in the + * bond, or %NULL if it can't be found. + */ +static inline struct port *__get_next_port(struct port *port) +{ + struct bonding *bond = __get_bond_by_port(port); + struct slave *slave = port->slave; + + // If there's no bond for this port, or this is the last slave + if ((bond == NULL) || (slave->next == bond->next)) { + return NULL; + } + + return &(SLAVE_AD_INFO(slave->next).port); +} + +/** + * __get_first_agg - get the first aggregator in the bond + * @port: a port that belongs to the bond we're looking at + * + * Return the aggregator of the first slave in @port's bond, or %NULL if it + * can't be found. + */ +static inline struct aggregator *__get_first_agg(struct port *port) +{ + struct bonding *bond = __get_bond_by_port(port); + + // If there's no bond for this port, or the bond has no slaves + if ((bond == NULL) || (bond->next == (struct slave *)bond)) { + return NULL; + } + + return &(SLAVE_AD_INFO(bond->next).aggregator); +} + +/** + * __get_next_agg - get the next aggregator in the bond + * @aggregator: the aggregator we're looking at + * + * Return the aggregator of the slave that is next in line of @aggregator's + * slave in the bond, or %NULL if it can't be found.
+ */ +static inline struct aggregator *__get_next_agg(struct aggregator *aggregator) +{ + struct slave *slave = aggregator->slave; + struct bonding *bond = bond_get_bond_by_slave(slave); + + // If there's no bond for this aggregator, or this is the last slave + if ((bond == NULL) || (slave->next == bond->next)) { + return NULL; + } + + return &(SLAVE_AD_INFO(slave->next).aggregator); +} + +/** + * __disable_port - disable the port's slave + * @port: the port we're looking at + * + */ +static inline void __disable_port(struct port *port) +{ + bond_set_slave_inactive_flags(port->slave); +} + +/** + * __enable_port - enable the port's slave, if it's up + * @port: the port we're looking at + * + */ +static inline void __enable_port(struct port *port) +{ + struct slave *slave = port->slave; + + if ((slave->link == BOND_LINK_UP) && IS_UP(slave->dev)) { + bond_set_slave_active_flags(slave); + } +} + +/** + * __port_is_enabled - check if the port's slave is in active state + * @port: the port we're looking at + * + */ +static inline int __port_is_enabled(struct port *port) +{ + return(port->slave->state == BOND_STATE_ACTIVE); +} + +/** + * __get_agg_selection_mode - get the aggregator selection mode + * @port: the port we're looking at + * + * Get the aggregator selection mode. Can be %BANDWIDTH or %COUNT. + */ +static inline u32 __get_agg_selection_mode(struct port *port) +{ + struct bonding *bond = __get_bond_by_port(port); + + if (bond == NULL) { + return AD_BANDWIDTH; + } + + return BOND_AD_INFO(bond).agg_select_mode; +} + +/** + * __check_agg_selection_timer - check if the selection timer has expired + * @port: the port we're looking at + * + */ +static inline int __check_agg_selection_timer(struct port *port) +{ + struct bonding *bond = __get_bond_by_port(port); + + if (bond == NULL) { + return 0; + } + + return BOND_AD_INFO(bond).agg_select_timer ? 
1 : 0; +} + +/** + * __get_rx_machine_lock - lock the port's RX machine + * @port: the port we're looking at + * + */ +static inline void __get_rx_machine_lock(struct port *port) +{ + spin_lock(&(SLAVE_AD_INFO(port->slave).rx_machine_lock)); +} + +/** + * __release_rx_machine_lock - unlock the port's RX machine + * @port: the port we're looking at + * + */ +static inline void __release_rx_machine_lock(struct port *port) +{ + spin_unlock(&(SLAVE_AD_INFO(port->slave).rx_machine_lock)); +} + +/** + * __get_link_speed - get a port's speed + * @port: the port we're looking at + * + * Return @port's speed in 802.3ad bitmask format. i.e. one of: + * 0, + * %AD_LINK_SPEED_BITMASK_10MBPS, + * %AD_LINK_SPEED_BITMASK_100MBPS, + * %AD_LINK_SPEED_BITMASK_1000MBPS + */ +static u16 __get_link_speed(struct port *port) +{ + struct slave *slave = port->slave; + u16 speed; + + /* this if covers only a special case: when the configuration starts with + * link down, it sets the speed to 0. + * This is done in spite of the fact that the e100 driver reports 0 to be + * compatible with MVT in the future.*/ + if (slave->link != BOND_LINK_UP) { + speed=0; + } else { + switch (slave->speed) { + case SPEED_10: + speed = AD_LINK_SPEED_BITMASK_10MBPS; + break; + + case SPEED_100: + speed = AD_LINK_SPEED_BITMASK_100MBPS; + break; + + case SPEED_1000: + speed = AD_LINK_SPEED_BITMASK_1000MBPS; + break; + + default: + speed = 0; // unknown speed value from ethtool. shouldn't happen + break; + } + } + + BOND_PRINT_DBG(("Port %d Received link speed %d update from adapter", port->actor_port_number, speed)); + return speed; +} + +/** + * __get_duplex - get a port's duplex + * @port: the port we're looking at + * + * Return @port's duplex in 802.3ad bitmask format. 
i.e.: + * 0x01 if in full duplex + * 0x00 otherwise + */ +static u8 __get_duplex(struct port *port) +{ + struct slave *slave = port->slave; + + u8 retval; + + // handling a special case: when the configuration starts with + // link down, it sets the duplex to 0. + if (slave->link != BOND_LINK_UP) { + retval=0x0; + } else { + switch (slave->duplex) { + case DUPLEX_FULL: + retval=0x1; + BOND_PRINT_DBG(("Port %d Received status full duplex update from adapter", port->actor_port_number)); + break; + case DUPLEX_HALF: + default: + retval=0x0; + BOND_PRINT_DBG(("Port %d Received status NOT full duplex update from adapter", port->actor_port_number)); + break; + } + } + return retval; +} + +/** + * __initialize_port_locks - initialize a port's RX machine spinlock + * @port: the port we're looking at + * + */ +static inline void __initialize_port_locks(struct port *port) +{ + // make sure it isn't called twice + spin_lock_init(&(SLAVE_AD_INFO(port->slave).rx_machine_lock)); +} + +/** + * __deinitialize_port_locks - deinitialize a port's RX machine spinlock + * @port: the port we're looking at + * + */ +static inline void __deinitialize_port_locks(struct port *port) +{ +} + +//conversions +/** + * __ntohs_lacpdu - convert the contents of a LACPDU to host byte order + * @lacpdu: the specified lacpdu + * + * For each multi-byte field in the lacpdu, convert its content + */ +static void __ntohs_lacpdu(struct lacpdu *lacpdu) +{ + if (lacpdu) { + lacpdu->actor_system_priority = ntohs(lacpdu->actor_system_priority); + lacpdu->actor_key = ntohs(lacpdu->actor_key); + lacpdu->actor_port_priority = ntohs(lacpdu->actor_port_priority); + lacpdu->actor_port = ntohs(lacpdu->actor_port); + lacpdu->partner_system_priority = ntohs(lacpdu->partner_system_priority); + lacpdu->partner_key = ntohs(lacpdu->partner_key); + lacpdu->partner_port_priority = ntohs(lacpdu->partner_port_priority); + lacpdu->partner_port = ntohs(lacpdu->partner_port); + lacpdu->collector_max_delay =
ntohs(lacpdu->collector_max_delay); + } +} + +/** + * __ad_timer_to_ticks - convert a given timer type to AD module ticks + * @timer_type: which timer to operate + * @par: timer parameter. see below + * + * If @timer_type is %current_while_timer, @par indicates long/short timer. + * If @timer_type is %periodic_timer, @par is one of %FAST_PERIODIC_TIME, + * %SLOW_PERIODIC_TIME. + */ +static u16 __ad_timer_to_ticks(u16 timer_type, u16 par) +{ + u16 retval=0; //to silence the compiler + + switch (timer_type) { + case AD_CURRENT_WHILE_TIMER: // for rx machine usage + if (par) { // for short or long timeout + retval = (AD_SHORT_TIMEOUT_TIME*ad_ticks_per_sec); // short timeout + } else { + retval = (AD_LONG_TIMEOUT_TIME*ad_ticks_per_sec); // long timeout + } + break; + case AD_ACTOR_CHURN_TIMER: // for local churn machine + retval = (AD_CHURN_DETECTION_TIME*ad_ticks_per_sec); + break; + case AD_PERIODIC_TIMER: // for periodic machine + retval = (par*ad_ticks_per_sec); // long timeout + break; + case AD_PARTNER_CHURN_TIMER: // for remote churn machine + retval = (AD_CHURN_DETECTION_TIME*ad_ticks_per_sec); + break; + case AD_WAIT_WHILE_TIMER: // for selection machine + retval = (AD_AGGREGATE_WAIT_TIME*ad_ticks_per_sec); + break; + } + return retval; +} + + +///////////////////////////////////////////////////////////////////////////////// +// ================= ad_rx_machine helper functions ================== +///////////////////////////////////////////////////////////////////////////////// + +/** + * __record_pdu - record parameters from a received lacpdu + * @lacpdu: the lacpdu we've received + * @port: the port we're looking at + * + * Record the parameter values for the Actor carried in a received lacpdu as + * the current partner operational parameter values and sets + * actor_oper_port_state.defaulted to FALSE. 
+ */ +static void __record_pdu(struct lacpdu *lacpdu, struct port *port) +{ + // validate lacpdu and port + if (lacpdu && port) { + // record the new parameter values for the partner operational + port->partner_oper_port_number = lacpdu->actor_port; + port->partner_oper_port_priority = lacpdu->actor_port_priority; + port->partner_oper_system = lacpdu->actor_system; + port->partner_oper_system_priority = lacpdu->actor_system_priority; + port->partner_oper_key = lacpdu->actor_key; + // zero partner's last states + port->partner_oper_port_state = 0; + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_LACP_ACTIVITY); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_LACP_TIMEOUT); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_AGGREGATION); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_SYNCHRONIZATION); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_COLLECTING); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_DISTRIBUTING); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_DEFAULTED); + port->partner_oper_port_state |= (lacpdu->actor_state & AD_STATE_EXPIRED); + + // set actor_oper_port_state.defaulted to FALSE + port->actor_oper_port_state &= ~AD_STATE_DEFAULTED; + + // set the partner sync. to on if the partner is sync. and the port is matched + if ((port->sm_vars & AD_PORT_MATCHED) && (lacpdu->actor_state & AD_STATE_SYNCHRONIZATION)) { + port->partner_oper_port_state |= AD_STATE_SYNCHRONIZATION; + } else { + port->partner_oper_port_state &= ~AD_STATE_SYNCHRONIZATION; + } + } +} + +/** + * __record_default - record default parameters + * @port: the port we're looking at + * + * This function records the default parameter values for the partner carried + * in the Partner Admin parameters as the current partner operational parameter + * values and sets actor_oper_port_state.defaulted to TRUE.
+ */ +static void __record_default(struct port *port) +{ + // validate the port + if (port) { + // record the partner admin parameters + port->partner_oper_port_number = port->partner_admin_port_number; + port->partner_oper_port_priority = port->partner_admin_port_priority; + port->partner_oper_system = port->partner_admin_system; + port->partner_oper_system_priority = port->partner_admin_system_priority; + port->partner_oper_key = port->partner_admin_key; + port->partner_oper_port_state = port->partner_admin_port_state; + + // set actor_oper_port_state.defaulted to true + port->actor_oper_port_state |= AD_STATE_DEFAULTED; + } +} + +/** + * __update_selected - update a port's Selected variable from a received lacpdu + * @lacpdu: the lacpdu we've received + * @port: the port we're looking at + * + * Update the value of the selected variable, using parameter values from a + * newly received lacpdu. The parameter values for the Actor carried in the + * received PDU are compared with the corresponding operational parameter + * values for the ports partner. If one or more of the comparisons shows that + * the value(s) received in the PDU differ from the current operational values, + * then selected is set to FALSE and actor_oper_port_state.synchronization is + * set to out_of_sync. Otherwise, selected remains unchanged. 
+ */ +static void __update_selected(struct lacpdu *lacpdu, struct port *port) +{ + // validate lacpdu and port + if (lacpdu && port) { + // check if any parameter is different + if ((lacpdu->actor_port != port->partner_oper_port_number) || + (lacpdu->actor_port_priority != port->partner_oper_port_priority) || + MAC_ADDRESS_COMPARE(&(lacpdu->actor_system), &(port->partner_oper_system)) || + (lacpdu->actor_system_priority != port->partner_oper_system_priority) || + (lacpdu->actor_key != port->partner_oper_key) || + ((lacpdu->actor_state & AD_STATE_AGGREGATION) != (port->partner_oper_port_state & AD_STATE_AGGREGATION)) + ) { + // update the state machine Selected variable + port->sm_vars &= ~AD_PORT_SELECTED; + } + } +} + +/** + * __update_default_selected - update a port's Selected variable from Partner + * @port: the port we're looking at + * + * This function updates the value of the selected variable, using the partner + * administrative parameter values. The administrative values are compared with + * the corresponding operational parameter values for the partner. If one or + * more of the comparisons shows that the administrative value(s) differ from + * the current operational values, then Selected is set to FALSE and + * actor_oper_port_state.synchronization is set to OUT_OF_SYNC. Otherwise, + * Selected remains unchanged. 
+ */ +static void __update_default_selected(struct port *port) +{ + // validate the port + if (port) { + // check if any parameter is different + if ((port->partner_admin_port_number != port->partner_oper_port_number) || + (port->partner_admin_port_priority != port->partner_oper_port_priority) || + MAC_ADDRESS_COMPARE(&(port->partner_admin_system), &(port->partner_oper_system)) || + (port->partner_admin_system_priority != port->partner_oper_system_priority) || + (port->partner_admin_key != port->partner_oper_key) || + ((port->partner_admin_port_state & AD_STATE_AGGREGATION) != (port->partner_oper_port_state & AD_STATE_AGGREGATION)) + ) { + // update the state machine Selected variable + port->sm_vars &= ~AD_PORT_SELECTED; + } + } +} + +/** + * __choose_matched - update a port's matched variable from a received lacpdu + * @lacpdu: the lacpdu we've received + * @port: the port we're looking at + * + * Update the value of the matched variable, using parameter values from a + * newly received lacpdu. Parameter values for the partner carried in the + * received PDU are compared with the corresponding operational parameter + * values for the actor. Matched is set to TRUE if all of these parameters + * match and the PDU parameter partner_state.aggregation has the same value as + * actor_oper_port_state.aggregation and lacp will actively maintain the link + * in the aggregation. Matched is also set to TRUE if the value of + * actor_state.aggregation in the received PDU is set to FALSE, i.e., indicates + * an individual link and lacp will actively maintain the link. Otherwise, + * matched is set to FALSE. LACP is considered to be actively maintaining the + * link if either the PDU's actor_state.lacp_activity variable is TRUE or both + * the actor's actor_oper_port_state.lacp_activity and the PDU's + * partner_state.lacp_activity variables are TRUE. 
+ */ +static void __choose_matched(struct lacpdu *lacpdu, struct port *port) +{ + // validate lacpdu and port + if (lacpdu && port) { + // check if all parameters are alike + if (((lacpdu->partner_port == port->actor_port_number) && + (lacpdu->partner_port_priority == port->actor_port_priority) && + !MAC_ADDRESS_COMPARE(&(lacpdu->partner_system), &(port->actor_system)) && + (lacpdu->partner_system_priority == port->actor_system_priority) && + (lacpdu->partner_key == port->actor_oper_port_key) && + ((lacpdu->partner_state & AD_STATE_AGGREGATION) == (port->actor_oper_port_state & AD_STATE_AGGREGATION))) || + // or this is individual link(aggregation == FALSE) + ((lacpdu->actor_state & AD_STATE_AGGREGATION) == 0) + ) { + // update the state machine Matched variable + port->sm_vars |= AD_PORT_MATCHED; + } else { + port->sm_vars &= ~AD_PORT_MATCHED; + } + } +} + +/** + * __update_ntt - update a port's ntt variable from a received lacpdu + * @lacpdu: the lacpdu we've received + * @port: the port we're looking at + * + * Updates the value of the ntt variable, using parameter values from a newly + * received lacpdu. The parameter values for the partner carried in the + * received PDU are compared with the corresponding operational parameter + * values for the Actor. If one or more of the comparisons shows that the + * value(s) received in the PDU differ from the current operational values, + * then ntt is set to TRUE. Otherwise, ntt remains unchanged. 
+ */ +static void __update_ntt(struct lacpdu *lacpdu, struct port *port) +{ + // validate lacpdu and port + if (lacpdu && port) { + // check if any parameter is different + if ((lacpdu->partner_port != port->actor_port_number) || + (lacpdu->partner_port_priority != port->actor_port_priority) || + MAC_ADDRESS_COMPARE(&(lacpdu->partner_system), &(port->actor_system)) || + (lacpdu->partner_system_priority != port->actor_system_priority) || + (lacpdu->partner_key != port->actor_oper_port_key) || + ((lacpdu->partner_state & AD_STATE_LACP_ACTIVITY) != (port->actor_oper_port_state & AD_STATE_LACP_ACTIVITY)) || + ((lacpdu->partner_state & AD_STATE_LACP_TIMEOUT) != (port->actor_oper_port_state & AD_STATE_LACP_TIMEOUT)) || + ((lacpdu->partner_state & AD_STATE_SYNCHRONIZATION) != (port->actor_oper_port_state & AD_STATE_SYNCHRONIZATION)) || + ((lacpdu->partner_state & AD_STATE_AGGREGATION) != (port->actor_oper_port_state & AD_STATE_AGGREGATION)) + ) { + // set ntt to be TRUE + port->ntt = 1; + } + } +} + +/** + * __attach_bond_to_agg + * @port: the port we're looking at + * + * Handle the attaching of the port's control parser/multiplexer and the + * aggregator. This function does nothing since the parser/multiplexer of the + * receive and the parser/multiplexer of the aggregator are already combined. + */ +static void __attach_bond_to_agg(struct port *port) +{ + port=NULL; // just to satisfy the compiler + // This function does nothing since the parser/multiplexer of the receive + // and the parser/multiplexer of the aggregator are already combined +} + +/** + * __detach_bond_from_agg + * @port: the port we're looking at + * + * Handle the detaching of the port's control parser/multiplexer from the + * aggregator. This function does nothing since the parser/multiplexer of the + * receive and the parser/multiplexer of the aggregator are already combined.
+ */ +static void __detach_bond_from_agg(struct port *port) +{ + port=NULL; // just to satisfy the compiler + // This function does nothing since the parser/multiplexer of the receive + // and the parser/multiplexer of the aggregator are already combined +} + +/** + * __agg_ports_are_ready - check if all ports in an aggregator are ready + * @aggregator: the aggregator we're looking at + * + */ +static int __agg_ports_are_ready(struct aggregator *aggregator) +{ + struct port *port; + int retval = 1; + + if (aggregator) { + // scan all ports in this aggregator to verify if they are all ready + for (port=aggregator->lag_ports; port; port=port->next_port_in_aggregator) { + if (!(port->sm_vars & AD_PORT_READY_N)) { + retval = 0; + break; + } + } + } + + return retval; +} + +/** + * __set_agg_ports_ready - set value of Ready bit in all ports of an aggregator + * @aggregator: the aggregator we're looking at + * @val: Should the ports' ready bit be set on or off + * + */ +static void __set_agg_ports_ready(struct aggregator *aggregator, int val) +{ + struct port *port; + + for (port=aggregator->lag_ports; port; port=port->next_port_in_aggregator) { + if (val) { + port->sm_vars |= AD_PORT_READY; + } else { + port->sm_vars &= ~AD_PORT_READY; + } + } +} + +/** + * __get_agg_bandwidth - get the total bandwidth of an aggregator + * @aggregator: the aggregator we're looking at + * + */ +static u32 __get_agg_bandwidth(struct aggregator *aggregator) +{ + u32 bandwidth=0; + u32 basic_speed; + + if (aggregator->num_of_ports) { + basic_speed = __get_link_speed(aggregator->lag_ports); + switch (basic_speed) { + case AD_LINK_SPEED_BITMASK_1MBPS: + bandwidth = aggregator->num_of_ports; + break; + case AD_LINK_SPEED_BITMASK_10MBPS: + bandwidth = aggregator->num_of_ports * 10; + break; + case AD_LINK_SPEED_BITMASK_100MBPS: + bandwidth = aggregator->num_of_ports * 100; + break; + case AD_LINK_SPEED_BITMASK_1000MBPS: + bandwidth = aggregator->num_of_ports * 1000; + break; + default: +
bandwidth=0; // to silence the compiler + } + } + return bandwidth; +} + +/** + * __get_active_agg - get the current active aggregator + * @aggregator: the aggregator we're looking at + * + */ +static struct aggregator *__get_active_agg(struct aggregator *aggregator) +{ + struct aggregator *retval = NULL; + + for (; aggregator; aggregator = __get_next_agg(aggregator)) { + if (aggregator->is_active) { + retval = aggregator; + break; + } + } + + return retval; +} + +////////////////////////////////////////////////////////////////////////////////////// +// ================= main 802.3ad protocol code ====================================== +////////////////////////////////////////////////////////////////////////////////////// + +/** + * ad_lacpdu_send - send out a lacpdu packet on a given port + * @port: the port we're looking at + * + * Returns: 0 on success + * < 0 on error + */ +static int ad_lacpdu_send(struct port *port) +{ + struct slave *slave = port->slave; + struct sk_buff *skb; + struct lacpdu_header *lacpdu_header; + int length = sizeof(struct lacpdu_header); + struct mac_addr lacpdu_multicast_address = AD_MULTICAST_LACPDU_ADDR; + + skb = dev_alloc_skb(length); + if (!skb) { + return -ENOMEM; + } + + skb->dev = slave->dev; + skb->mac.raw = skb->data; + skb->nh.raw = skb->data + ETH_HLEN; + skb->protocol = PKT_TYPE_LACPDU; + + lacpdu_header = (struct lacpdu_header *)skb_put(skb, length); + + lacpdu_header->ad_header.destination_address = lacpdu_multicast_address; + /* Note: source address is set to be the member's PERMANENT address, because we use it + to identify loopback lacpdus in receive.
*/ + lacpdu_header->ad_header.source_address = *((struct mac_addr *)(slave->perm_hwaddr)); + lacpdu_header->ad_header.length_type = PKT_TYPE_LACPDU; + + lacpdu_header->lacpdu = port->lacpdu; // struct copy + + dev_queue_xmit(skb); + + return 0; +} + +/** + * ad_marker_send - send marker information/response on a given port + * @port: the port we're looking at + * @marker: marker data to send + * + * Returns: 0 on success + * < 0 on error + */ +static int ad_marker_send(struct port *port, struct marker *marker) +{ + struct slave *slave = port->slave; + struct sk_buff *skb; + struct marker_header *marker_header; + int length = sizeof(struct marker_header); + struct mac_addr lacpdu_multicast_address = AD_MULTICAST_LACPDU_ADDR; + + skb = dev_alloc_skb(length + 16); + if (!skb) { + return -ENOMEM; + } + + skb_reserve(skb, 16); + + skb->dev = slave->dev; + skb->mac.raw = skb->data; + skb->nh.raw = skb->data + ETH_HLEN; + skb->protocol = PKT_TYPE_LACPDU; + + marker_header = (struct marker_header *)skb_put(skb, length); + + marker_header->ad_header.destination_address = lacpdu_multicast_address; + /* Note: source address is set to be the member's PERMANENT address, because we use it + to identify loopback MARKERs in receive.
*/ + marker_header->ad_header.source_address = *((struct mac_addr *)(slave->perm_hwaddr)); + marker_header->ad_header.length_type = PKT_TYPE_LACPDU; + + marker_header->marker = *marker; // struct copy + + dev_queue_xmit(skb); + + return 0; +} + +/** + * ad_mux_machine - handle a port's mux state machine + * @port: the port we're looking at + * + */ +static void ad_mux_machine(struct port *port) +{ + mux_states_t last_state; + + // keep current State Machine state to compare later if it was changed + last_state = port->sm_mux_state; + + if (port->sm_vars & AD_PORT_BEGIN) { + port->sm_mux_state = AD_MUX_DETACHED; // next state + } else { + switch (port->sm_mux_state) { + case AD_MUX_DETACHED: + if ((port->sm_vars & AD_PORT_SELECTED) || (port->sm_vars & AD_PORT_STANDBY)) { // if SELECTED or STANDBY + port->sm_mux_state = AD_MUX_WAITING; // next state + } + break; + case AD_MUX_WAITING: + // if SELECTED == FALSE return to DETACH state + if (!(port->sm_vars & AD_PORT_SELECTED)) { // if UNSELECTED + port->sm_vars &= ~AD_PORT_READY_N; + // in order to withhold the Selection Logic to check all ports READY_N value + // every callback cycle to update ready variable, we check READY_N and update READY here + __set_agg_ports_ready(port->aggregator, __agg_ports_are_ready(port->aggregator)); + port->sm_mux_state = AD_MUX_DETACHED; // next state + break; + } + + // check if the wait_while_timer expired + if (port->sm_mux_timer_counter && !(--port->sm_mux_timer_counter)) { + port->sm_vars |= AD_PORT_READY_N; + } + + // in order to withhold the selection logic to check all ports READY_N value + // every callback cycle to update ready variable, we check READY_N and update READY here + __set_agg_ports_ready(port->aggregator, __agg_ports_are_ready(port->aggregator)); + + // if the wait_while_timer expired, and the port is in READY state, move to ATTACHED state + if ((port->sm_vars & AD_PORT_READY) && !port->sm_mux_timer_counter) { + port->sm_mux_state = AD_MUX_ATTACHED; // next state + 
} + break; + case AD_MUX_ATTACHED: + // also check that agg_select_timer has expired (so the port is enabled only after this timer) + if ((port->sm_vars & AD_PORT_SELECTED) && (port->partner_oper_port_state & AD_STATE_SYNCHRONIZATION) && !__check_agg_selection_timer(port)) { + port->sm_mux_state = AD_MUX_COLLECTING_DISTRIBUTING;// next state + } else if (!(port->sm_vars & AD_PORT_SELECTED) || (port->sm_vars & AD_PORT_STANDBY)) { // if UNSELECTED or STANDBY + port->sm_vars &= ~AD_PORT_READY_N; + // in order to withhold the selection logic to check all ports READY_N value + // every callback cycle to update ready variable, we check READY_N and update READY here + __set_agg_ports_ready(port->aggregator, __agg_ports_are_ready(port->aggregator)); + port->sm_mux_state = AD_MUX_DETACHED;// next state + } + break; + case AD_MUX_COLLECTING_DISTRIBUTING: + if (!(port->sm_vars & AD_PORT_SELECTED) || (port->sm_vars & AD_PORT_STANDBY) || + !(port->partner_oper_port_state & AD_STATE_SYNCHRONIZATION) + ) { + port->sm_mux_state = AD_MUX_ATTACHED;// next state + + } else { + // if port state hasn't changed make + // sure that a collecting distributing + // port in an active aggregator is enabled + if (port->aggregator && + port->aggregator->is_active && + !__port_is_enabled(port)) { + + __enable_port(port); + } + } + break; + default: //to silence the compiler + break; + } + } + + // check if the state machine was changed + if (port->sm_mux_state != last_state) { + BOND_PRINT_DBG(("Mux Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_mux_state)); + switch (port->sm_mux_state) { + case AD_MUX_DETACHED: + __detach_bond_from_agg(port); + port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION; + ad_disable_collecting_distributing(port); + port->actor_oper_port_state &= ~AD_STATE_COLLECTING; + port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING; + port->ntt = 1; + break; + case AD_MUX_WAITING: + port->sm_mux_timer_counter = 
__ad_timer_to_ticks(AD_WAIT_WHILE_TIMER, 0); + break; + case AD_MUX_ATTACHED: + __attach_bond_to_agg(port); + port->actor_oper_port_state |= AD_STATE_SYNCHRONIZATION; + port->actor_oper_port_state &= ~AD_STATE_COLLECTING; + port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING; + ad_disable_collecting_distributing(port); + port->ntt = 1; + break; + case AD_MUX_COLLECTING_DISTRIBUTING: + port->actor_oper_port_state |= AD_STATE_COLLECTING; + port->actor_oper_port_state |= AD_STATE_DISTRIBUTING; + ad_enable_collecting_distributing(port); + port->ntt = 1; + break; + default: //to silence the compiler + break; + } + } +} + +/** + * ad_rx_machine - handle a port's rx State Machine + * @lacpdu: the lacpdu we've received + * @port: the port we're looking at + * + * If lacpdu arrived, stop previous timer (if exists) and set the next state as + * CURRENT. If timer expired set the state machine in the proper state. + * In other cases, this function checks if we need to switch to other state. + */ +static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port) +{ + rx_states_t last_state; + + // Lock to prevent 2 instances of this function to run simultaneously(rx interrupt and periodic machine callback) + __get_rx_machine_lock(port); + + // keep current State Machine state to compare later if it was changed + last_state = port->sm_rx_state; + + // check if state machine should change state + // first, check if port was reinitialized + if (port->sm_vars & AD_PORT_BEGIN) { + port->sm_rx_state = AD_RX_INITIALIZE; // next state + } + // check if port is not enabled + else if (!(port->sm_vars & AD_PORT_BEGIN) && !port->is_enabled && !(port->sm_vars & AD_PORT_MOVED)) { + port->sm_rx_state = AD_RX_PORT_DISABLED; // next state + } + // check if new lacpdu arrived + else if (lacpdu && ((port->sm_rx_state == AD_RX_EXPIRED) || (port->sm_rx_state == AD_RX_DEFAULTED) || (port->sm_rx_state == AD_RX_CURRENT))) { + port->sm_rx_timer_counter = 0; // zero timer + port->sm_rx_state = 
AD_RX_CURRENT; + } else { + // if timer is on, and if it is expired + if (port->sm_rx_timer_counter && !(--port->sm_rx_timer_counter)) { + switch (port->sm_rx_state) { + case AD_RX_EXPIRED: + port->sm_rx_state = AD_RX_DEFAULTED; // next state + break; + case AD_RX_CURRENT: + port->sm_rx_state = AD_RX_EXPIRED; // next state + break; + default: //to silence the compiler + break; + } + } else { + // if no lacpdu arrived and no timer is on + switch (port->sm_rx_state) { + case AD_RX_PORT_DISABLED: + if (port->sm_vars & AD_PORT_MOVED) { + port->sm_rx_state = AD_RX_INITIALIZE; // next state + } else if (port->is_enabled && (port->sm_vars & AD_PORT_LACP_ENABLED)) { + port->sm_rx_state = AD_RX_EXPIRED; // next state + } else if (port->is_enabled && ((port->sm_vars & AD_PORT_LACP_ENABLED) == 0)) { + port->sm_rx_state = AD_RX_LACP_DISABLED; // next state + } + break; + default: //to silence the compiler + break; + + } + } + } + + // check if the State machine was changed or new lacpdu arrived + if ((port->sm_rx_state != last_state) || (lacpdu)) { + BOND_PRINT_DBG(("Rx Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_rx_state)); + switch (port->sm_rx_state) { + case AD_RX_INITIALIZE: + if (!(port->actor_oper_port_key & AD_DUPLEX_KEY_BITS)) { + port->sm_vars &= ~AD_PORT_LACP_ENABLED; + } else { + port->sm_vars |= AD_PORT_LACP_ENABLED; + } + port->sm_vars &= ~AD_PORT_SELECTED; + __record_default(port); + port->actor_oper_port_state &= ~AD_STATE_EXPIRED; + port->sm_vars &= ~AD_PORT_MOVED; + port->sm_rx_state = AD_RX_PORT_DISABLED; // next state + + /*- Fall Through -*/ + + case AD_RX_PORT_DISABLED: + port->sm_vars &= ~AD_PORT_MATCHED; + break; + case AD_RX_LACP_DISABLED: + port->sm_vars &= ~AD_PORT_SELECTED; + __record_default(port); + port->partner_oper_port_state &= ~AD_STATE_AGGREGATION; + port->sm_vars |= AD_PORT_MATCHED; + port->actor_oper_port_state &= ~AD_STATE_EXPIRED; + break; + case AD_RX_EXPIRED: + //Reset of the 
Synchronization flag. (Standard 43.4.12) + //This reset cause to disable this port in the COLLECTING_DISTRIBUTING state of the + //mux machine in case of EXPIRED even if LINK_DOWN didn't arrive for the port. + port->partner_oper_port_state &= ~AD_STATE_SYNCHRONIZATION; + port->sm_vars &= ~AD_PORT_MATCHED; + port->partner_oper_port_state |= AD_SHORT_TIMEOUT; + port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(AD_SHORT_TIMEOUT)); + port->actor_oper_port_state |= AD_STATE_EXPIRED; + break; + case AD_RX_DEFAULTED: + __update_default_selected(port); + __record_default(port); + port->sm_vars |= AD_PORT_MATCHED; + port->actor_oper_port_state &= ~AD_STATE_EXPIRED; + break; + case AD_RX_CURRENT: + // detect loopback situation + if (!MAC_ADDRESS_COMPARE(&(lacpdu->actor_system), &(port->actor_system))) { + // INFO_RECEIVED_LOOPBACK_FRAMES + printk(KERN_ERR "bonding: An illegal loopback occurred on adapter (%s)\n", + port->slave->dev->name); + printk(KERN_ERR "Check the configuration to verify that all Adapters " + "are connected to 802.3ad compliant switch ports\n"); + __release_rx_machine_lock(port); + return; + } + __update_selected(lacpdu, port); + __update_ntt(lacpdu, port); + __record_pdu(lacpdu, port); + __choose_matched(lacpdu, port); + port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(port->actor_oper_port_state & AD_STATE_LACP_TIMEOUT)); + port->actor_oper_port_state &= ~AD_STATE_EXPIRED; + // verify that if the aggregator is enabled, the port is enabled too. 
+ //(because if the link goes down for a short time, the 802.3ad will not + // catch it, and the port will continue to be disabled) + if (port->aggregator && port->aggregator->is_active && !__port_is_enabled(port)) { + __enable_port(port); + } + break; + default: //to silence the compiler + break; + } + } + __release_rx_machine_lock(port); +} + +/** + * ad_tx_machine - handle a port's tx state machine + * @port: the port we're looking at + * + */ +static void ad_tx_machine(struct port *port) +{ + struct lacpdu *lacpdu = &port->lacpdu; + + // check if tx timer expired, to verify that we do not send more than 3 packets per second + if (port->sm_tx_timer_counter && !(--port->sm_tx_timer_counter)) { + // check if there is something to send + if (port->ntt && (port->sm_vars & AD_PORT_LACP_ENABLED)) { + //update current actual Actor parameters + //lacpdu->subtype initialized + //lacpdu->version_number initialized + //lacpdu->tlv_type_actor_info initialized + //lacpdu->actor_information_length initialized + lacpdu->actor_system_priority = port->actor_system_priority; + lacpdu->actor_system = port->actor_system; + lacpdu->actor_key = port->actor_oper_port_key; + lacpdu->actor_port_priority = port->actor_port_priority; + lacpdu->actor_port = port->actor_port_number; + lacpdu->actor_state = port->actor_oper_port_state; + //lacpdu->reserved_3_1 initialized + //lacpdu->tlv_type_partner_info initialized + //lacpdu->partner_information_length initialized + lacpdu->partner_system_priority = port->partner_oper_system_priority; + lacpdu->partner_system = port->partner_oper_system; + lacpdu->partner_key = port->partner_oper_key; + lacpdu->partner_port_priority = port->partner_oper_port_priority; + lacpdu->partner_port = port->partner_oper_port_number; + lacpdu->partner_state = port->partner_oper_port_state; + //lacpdu->reserved_3_2 initialized + //lacpdu->tlv_type_collector_info initialized + //lacpdu->collector_information_length initialized + //collector_max_delay initialized + 
//reserved_12[12] initialized + //tlv_type_terminator initialized + //terminator_length initialized + //reserved_50[50] initialized + + // We need to convert all non u8 parameters to Big Endian for transmit + __ntohs_lacpdu(lacpdu); + // send the lacpdu + if (ad_lacpdu_send(port) >= 0) { + BOND_PRINT_DBG(("Sent LACPDU on port %d", port->actor_port_number)); + // mark ntt as false, so it will not be sent again until demanded + port->ntt = 0; + } + } + // restart tx timer (to verify that we will not exceed AD_MAX_TX_IN_SECOND) + port->sm_tx_timer_counter=ad_ticks_per_sec/AD_MAX_TX_IN_SECOND; + } +} + +/** + * ad_periodic_machine - handle a port's periodic state machine + * @port: the port we're looking at + * + * Turn ntt flag on periodically to perform periodic transmission of LACPDUs. + */ +static void ad_periodic_machine(struct port *port) +{ + periodic_states_t last_state; + + // keep current state machine state to compare later if it was changed + last_state = port->sm_periodic_state; + + // check if port was reinitialized + if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) || + (!(port->actor_oper_port_state & AD_STATE_LACP_ACTIVITY) && !(port->partner_oper_port_state & AD_STATE_LACP_ACTIVITY)) + ) { + port->sm_periodic_state = AD_NO_PERIODIC; // next state + } + // check if state machine should change state + else if (port->sm_periodic_timer_counter) { + // check if periodic state machine expired + if (!(--port->sm_periodic_timer_counter)) { + // if expired then do tx + port->sm_periodic_state = AD_PERIODIC_TX; // next state + } else { + // If not expired, check if there is some new timeout parameter from the partner state + switch (port->sm_periodic_state) { + case AD_FAST_PERIODIC: + if (!(port->partner_oper_port_state & AD_STATE_LACP_TIMEOUT)) { + port->sm_periodic_state = AD_SLOW_PERIODIC; // next state + } + break; + case AD_SLOW_PERIODIC: + if ((port->partner_oper_port_state & AD_STATE_LACP_TIMEOUT)) { + 
// stop current timer + port->sm_periodic_timer_counter = 0; + port->sm_periodic_state = AD_PERIODIC_TX; // next state + } + break; + default: //to silence the compiler + break; + } + } + } else { + switch (port->sm_periodic_state) { + case AD_NO_PERIODIC: + port->sm_periodic_state = AD_FAST_PERIODIC; // next state + break; + case AD_PERIODIC_TX: + if (!(port->partner_oper_port_state & AD_STATE_LACP_TIMEOUT)) { + port->sm_periodic_state = AD_SLOW_PERIODIC; // next state + } else { + port->sm_periodic_state = AD_FAST_PERIODIC; // next state + } + break; + default: //to silence the compiler + break; + } + } + + // check if the state machine was changed + if (port->sm_periodic_state != last_state) { + BOND_PRINT_DBG(("Periodic Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_periodic_state)); + switch (port->sm_periodic_state) { + case AD_NO_PERIODIC: + port->sm_periodic_timer_counter = 0; // zero timer + break; + case AD_FAST_PERIODIC: + port->sm_periodic_timer_counter = __ad_timer_to_ticks(AD_PERIODIC_TIMER, (u16)(AD_FAST_PERIODIC_TIME))-1; // decrement 1 tick we lost in the PERIODIC_TX cycle + break; + case AD_SLOW_PERIODIC: + port->sm_periodic_timer_counter = __ad_timer_to_ticks(AD_PERIODIC_TIMER, (u16)(AD_SLOW_PERIODIC_TIME))-1; // decrement 1 tick we lost in the PERIODIC_TX cycle + break; + case AD_PERIODIC_TX: + port->ntt = 1; + break; + default: //to silence the compiler + break; + } + } +} + +/** + * ad_port_selection_logic - select aggregation groups + * @port: the port we're looking at + * + * Select aggregation groups, and assign each port to its aggregator. The + * selection logic is called in the initialization (after all the handshakes), + * and after every lacpdu receive (if selected is off). 
+ */ +static void ad_port_selection_logic(struct port *port) +{ + struct aggregator *aggregator, *free_aggregator = NULL, *temp_aggregator; + struct port *last_port = NULL, *curr_port; + int found = 0; + + // if the port is already Selected, do nothing + if (port->sm_vars & AD_PORT_SELECTED) { + return; + } + + // if the port is connected to other aggregator, detach it + if (port->aggregator) { + // detach the port from its former aggregator + temp_aggregator=port->aggregator; + for (curr_port=temp_aggregator->lag_ports; curr_port; last_port=curr_port, curr_port=curr_port->next_port_in_aggregator) { + if (curr_port == port) { + temp_aggregator->num_of_ports--; + if (!last_port) {// if it is the first port attached to the aggregator + temp_aggregator->lag_ports=port->next_port_in_aggregator; + } else {// not the first port attached to the aggregator + last_port->next_port_in_aggregator=port->next_port_in_aggregator; + } + + // clear the port's relations to this aggregator + port->aggregator = NULL; + port->next_port_in_aggregator=NULL; + port->actor_port_aggregator_identifier=0; + + BOND_PRINT_DBG(("Port %d left LAG %d", port->actor_port_number, temp_aggregator->aggregator_identifier)); + // if the aggregator is empty, clear its parameters, and set it ready to be attached + if (!temp_aggregator->lag_ports) { + ad_clear_agg(temp_aggregator); + } + break; + } + } + if (!curr_port) { // meaning: the port was related to an aggregator but was not on the aggregator port list + printk(KERN_WARNING "bonding: Warning: Port %d (on %s) was " + "related to aggregator %d but was not on its port list\n", + port->actor_port_number, port->slave->dev->name, + port->aggregator->aggregator_identifier); + } + } + // search on all aggregators for a suitable aggregator for this port + for (aggregator = __get_first_agg(port); aggregator; + aggregator = __get_next_agg(aggregator)) { + + // keep a free aggregator for later use(if needed) + if (!aggregator->lag_ports) { + if 
(!free_aggregator) { + free_aggregator=aggregator; + } + continue; + } + // check if current aggregator suits us + if (((aggregator->actor_oper_aggregator_key == port->actor_oper_port_key) && // if all parameters match AND + !MAC_ADDRESS_COMPARE(&(aggregator->partner_system), &(port->partner_oper_system)) && + (aggregator->partner_system_priority == port->partner_oper_system_priority) && + (aggregator->partner_oper_aggregator_key == port->partner_oper_key) + ) && + ((MAC_ADDRESS_COMPARE(&(port->partner_oper_system), &(null_mac_addr)) && // partner answers + !aggregator->is_individual) // but is not individual OR + ) + ) { + // attach to the founded aggregator + port->aggregator = aggregator; + port->actor_port_aggregator_identifier=port->aggregator->aggregator_identifier; + port->next_port_in_aggregator=aggregator->lag_ports; + port->aggregator->num_of_ports++; + aggregator->lag_ports=port; + BOND_PRINT_DBG(("Port %d joined LAG %d(existing LAG)", port->actor_port_number, port->aggregator->aggregator_identifier)); + + // mark this port as selected + port->sm_vars |= AD_PORT_SELECTED; + found = 1; + break; + } + } + + // the port couldn't find an aggregator - attach it to a new aggregator + if (!found) { + if (free_aggregator) { + // assign port a new aggregator + port->aggregator = free_aggregator; + port->actor_port_aggregator_identifier=port->aggregator->aggregator_identifier; + + // update the new aggregator's parameters + // if port was responsed from the end-user + if (port->actor_oper_port_key & AD_DUPLEX_KEY_BITS) {// if port is full duplex + port->aggregator->is_individual = 0; + } else { + port->aggregator->is_individual = 1; + } + + port->aggregator->actor_admin_aggregator_key = port->actor_admin_port_key; + port->aggregator->actor_oper_aggregator_key = port->actor_oper_port_key; + port->aggregator->partner_system=port->partner_oper_system; + port->aggregator->partner_system_priority = port->partner_oper_system_priority; + 
port->aggregator->partner_oper_aggregator_key = port->partner_oper_key; + port->aggregator->receive_state = 1; + port->aggregator->transmit_state = 1; + port->aggregator->lag_ports = port; + port->aggregator->num_of_ports++; + + // mark this port as selected + port->sm_vars |= AD_PORT_SELECTED; + + BOND_PRINT_DBG(("Port %d joined LAG %d(new LAG)", port->actor_port_number, port->aggregator->aggregator_identifier)); + } else { + printk(KERN_ERR "bonding: Port %d (on %s) did not find a suitable aggregator\n", + port->actor_port_number, port->slave->dev->name); + } + } + // if all aggregator's ports are READY_N == TRUE, set ready=TRUE in all aggregator's ports + // else set ready=FALSE in all aggregator's ports + __set_agg_ports_ready(port->aggregator, __agg_ports_are_ready(port->aggregator)); + + if (!__check_agg_selection_timer(port) && (aggregator = __get_first_agg(port))) { + ad_agg_selection_logic(aggregator); + } +} + +/** + * ad_agg_selection_logic - select an aggregation group for a team + * @aggregator: the aggregator we're looking at + * + * It is assumed that only one aggregator may be selected for a team. + * The logic of this function is to select (at first time) the aggregator with + * the most ports attached to it, and to reselect the active aggregator only if + * the previous aggregator has no more ports related to it. + * + * FIXME: this function MUST be called with the first agg in the bond, or + * __get_active_agg() won't work correctly. This function should be better + * called with the bond itself, and retrieve the first agg from it. 
+ */ +static void ad_agg_selection_logic(struct aggregator *aggregator) +{ + struct aggregator *best_aggregator = NULL, *active_aggregator = NULL; + struct aggregator *last_active_aggregator = NULL, *origin_aggregator; + struct port *port; + u16 num_of_aggs=0; + + origin_aggregator = aggregator; + + //get current active aggregator + last_active_aggregator = __get_active_agg(aggregator); + + // search for the aggregator with the most ports attached to it. + do { + // count how many candidate lag's we have + if (aggregator->lag_ports) { + num_of_aggs++; + } + if (aggregator->is_active && !aggregator->is_individual && // if current aggregator is the active aggregator + MAC_ADDRESS_COMPARE(&(aggregator->partner_system), &(null_mac_addr))) { // and partner answers to 802.3ad PDUs + if (aggregator->num_of_ports) { // if any ports attached to the current aggregator + best_aggregator=NULL; // disregard the best aggregator that was chosen by now + break; // stop the selection of other aggregator if there are any ports attached to this active aggregator + } else { // no ports attached to this active aggregator + aggregator->is_active = 0; // mark this aggregator as not active anymore + } + } + if (aggregator->num_of_ports) { // if any ports attached + if (best_aggregator) { // if there is a candidte aggregator + //The reasons for choosing new best aggregator: + // 1. if current agg is NOT individual and the best agg chosen so far is individual OR + // current and best aggs are both individual or both not individual, AND + // 2a. current agg partner reply but best agg partner do not reply OR + // 2b. 
current agg partner reply OR current agg partner do not reply AND best agg partner also do not reply AND + // current has more ports/bandwidth, or same amount of ports but current has faster ports, THEN + // current agg become best agg so far + + //if current agg is NOT individual and the best agg chosen so far is individual change best_aggregator + if (!aggregator->is_individual && best_aggregator->is_individual) { + best_aggregator=aggregator; + } + // current and best aggs are both individual or both not individual + else if ((aggregator->is_individual && best_aggregator->is_individual) || + (!aggregator->is_individual && !best_aggregator->is_individual)) { + // current and best aggs are both individual or both not individual AND + // current agg partner reply but best agg partner do not reply + if ((MAC_ADDRESS_COMPARE(&(aggregator->partner_system), &(null_mac_addr)) && + !MAC_ADDRESS_COMPARE(&(best_aggregator->partner_system), &(null_mac_addr)))) { + best_aggregator=aggregator; + } + // current agg partner reply OR current agg partner do not reply AND best agg partner also do not reply + else if (! 
(!MAC_ADDRESS_COMPARE(&(aggregator->partner_system), &(null_mac_addr)) && + MAC_ADDRESS_COMPARE(&(best_aggregator->partner_system), &(null_mac_addr)))) { + if ((__get_agg_selection_mode(aggregator->lag_ports) == AD_BANDWIDTH)&& + (__get_agg_bandwidth(aggregator) > __get_agg_bandwidth(best_aggregator))) { + best_aggregator=aggregator; + } else if (__get_agg_selection_mode(aggregator->lag_ports) == AD_COUNT) { + if (((aggregator->num_of_ports > best_aggregator->num_of_ports) && + (aggregator->actor_oper_aggregator_key & AD_SPEED_KEY_BITS))|| + ((aggregator->num_of_ports == best_aggregator->num_of_ports) && + ((u16)(aggregator->actor_oper_aggregator_key & AD_SPEED_KEY_BITS) > + (u16)(best_aggregator->actor_oper_aggregator_key & AD_SPEED_KEY_BITS)))) { + best_aggregator=aggregator; + } + } + } + } + } else { + best_aggregator=aggregator; + } + } + aggregator->is_active = 0; // mark all aggregators as not active anymore + } while ((aggregator = __get_next_agg(aggregator))); + + // if we have new aggregator selected, don't replace the old aggregator if it has an answering partner, + // or if both old aggregator and new aggregator don't have answering partner + if (best_aggregator) { + if (last_active_aggregator && last_active_aggregator->lag_ports && last_active_aggregator->lag_ports->is_enabled && + (MAC_ADDRESS_COMPARE(&(last_active_aggregator->partner_system), &(null_mac_addr)) || // partner answers OR + (!MAC_ADDRESS_COMPARE(&(last_active_aggregator->partner_system), &(null_mac_addr)) && // both old and new + !MAC_ADDRESS_COMPARE(&(best_aggregator->partner_system), &(null_mac_addr)))) // partner do not answer + ) { + // if new aggregator has link, and old aggregator does not, replace old aggregator.(do nothing) + // -> don't replace otherwise. 
+ if (!(!last_active_aggregator->actor_oper_aggregator_key && best_aggregator->actor_oper_aggregator_key)) { + best_aggregator=NULL; + last_active_aggregator->is_active = 1; // don't replace good old aggregator + + } + } + } + + // if there is a new best aggregator, activate it + if (best_aggregator) { + for (aggregator = __get_first_agg(best_aggregator->lag_ports); + aggregator; + aggregator = __get_next_agg(aggregator)) { + + BOND_PRINT_DBG(("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d", + aggregator->aggregator_identifier, aggregator->num_of_ports, + aggregator->actor_oper_aggregator_key, aggregator->partner_oper_aggregator_key, + aggregator->is_individual, aggregator->is_active)); + } + + // check whether any partner replies + if (best_aggregator->is_individual) { + printk(KERN_WARNING "bonding: Warning: No 802.3ad response from the link partner " + "for any adapters in the bond\n"); + } + + // check whether there is more than one aggregator + if (num_of_aggs > 1) { + BOND_PRINT_DBG(("Warning: More than one Link Aggregation Group was " + "found in the bond. 
Only one group will function in the bond")); + } + + best_aggregator->is_active = 1; + BOND_PRINT_DBG(("LAG %d choosed as the active LAG", best_aggregator->aggregator_identifier)); + BOND_PRINT_DBG(("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d", + best_aggregator->aggregator_identifier, best_aggregator->num_of_ports, + best_aggregator->actor_oper_aggregator_key, best_aggregator->partner_oper_aggregator_key, + best_aggregator->is_individual, best_aggregator->is_active)); + + // disable the ports that were related to the former active_aggregator + if (last_active_aggregator) { + for (port=last_active_aggregator->lag_ports; port; port=port->next_port_in_aggregator) { + __disable_port(port); + } + } + } + + // if the selected aggregator is of join individuals(partner_system is NULL), enable their ports + active_aggregator = __get_active_agg(origin_aggregator); + + if (active_aggregator) { + if (!MAC_ADDRESS_COMPARE(&(active_aggregator->partner_system), &(null_mac_addr))) { + for (port=active_aggregator->lag_ports; port; port=port->next_port_in_aggregator) { + __enable_port(port); + } + } + } +} + +/** + * ad_clear_agg - clear a given aggregator's parameters + * @aggregator: the aggregator we're looking at + * + */ +static void ad_clear_agg(struct aggregator *aggregator) +{ + if (aggregator) { + aggregator->is_individual = 0; + aggregator->actor_admin_aggregator_key = 0; + aggregator->actor_oper_aggregator_key = 0; + aggregator->partner_system = null_mac_addr; + aggregator->partner_system_priority = 0; + aggregator->partner_oper_aggregator_key = 0; + aggregator->receive_state = 0; + aggregator->transmit_state = 0; + aggregator->lag_ports = NULL; + aggregator->is_active = 0; + aggregator->num_of_ports = 0; + BOND_PRINT_DBG(("LAG %d was cleared", aggregator->aggregator_identifier)); + } +} + +/** + * ad_initialize_agg - initialize a given aggregator's parameters + * @aggregator: the aggregator we're looking at + * + */ +static void ad_initialize_agg(struct 
aggregator *aggregator) +{ + if (aggregator) { + ad_clear_agg(aggregator); + + aggregator->aggregator_mac_address = null_mac_addr; + aggregator->aggregator_identifier = 0; + aggregator->slave = NULL; + } +} + +/** + * ad_initialize_port - initialize a given port's parameters + * @port: the port we're looking at + * + */ +static void ad_initialize_port(struct port *port) +{ + if (port) { + port->actor_port_number = 1; + port->actor_port_priority = 0xff; + port->actor_system = null_mac_addr; + port->actor_system_priority = 0xffff; + port->actor_port_aggregator_identifier = 0; + port->ntt = 0; + port->actor_admin_port_key = 1; + port->actor_oper_port_key = 1; + port->actor_admin_port_state = AD_STATE_AGGREGATION | AD_STATE_LACP_ACTIVITY; + port->actor_oper_port_state = AD_STATE_AGGREGATION | AD_STATE_LACP_ACTIVITY; + port->partner_admin_system = null_mac_addr; + port->partner_oper_system = null_mac_addr; + port->partner_admin_system_priority = 0xffff; + port->partner_oper_system_priority = 0xffff; + port->partner_admin_key = 1; + port->partner_oper_key = 1; + port->partner_admin_port_number = 1; + port->partner_oper_port_number = 1; + port->partner_admin_port_priority = 0xff; + port->partner_oper_port_priority = 0xff; + port->partner_admin_port_state = 1; + port->partner_oper_port_state = 1; + port->is_enabled = 1; + // ****** private parameters ****** + port->sm_vars = 0x3; + port->sm_rx_state = 0; + port->sm_rx_timer_counter = 0; + port->sm_periodic_state = 0; + port->sm_periodic_timer_counter = 0; + port->sm_mux_state = 0; + port->sm_mux_timer_counter = 0; + port->sm_tx_state = 0; + port->sm_tx_timer_counter = 0; + port->slave = NULL; + port->aggregator = NULL; + port->next_port_in_aggregator = NULL; + port->transaction_id = 0; + + ad_initialize_lacpdu(&(port->lacpdu)); + } +} + +/** + * ad_enable_collecting_distributing - enable a port's transmit/receive + * @port: the port we're looking at + * + * Enable @port if it's in an active aggregator + */ 
+static void ad_enable_collecting_distributing(struct port *port) +{ + if (port->aggregator->is_active) { + BOND_PRINT_DBG(("Enabling port %d(LAG %d)", port->actor_port_number, port->aggregator->aggregator_identifier)); + __enable_port(port); + } +} + +/** + * ad_disable_collecting_distributing - disable a port's transmit/receive + * @port: the port we're looking at + * + */ +static void ad_disable_collecting_distributing(struct port *port) +{ + if (port->aggregator && MAC_ADDRESS_COMPARE(&(port->aggregator->partner_system), &(null_mac_addr))) { + BOND_PRINT_DBG(("Disabling port %d(LAG %d)", port->actor_port_number, port->aggregator->aggregator_identifier)); + __disable_port(port); + } +} + +#if 0 +/** + * ad_marker_info_send - send a marker information frame + * @port: the port we're looking at + * + * This function does nothing since we decided not to implement send and handle + * response for marker PDU's, in this stage, but only to respond to marker + * information. + */ +static void ad_marker_info_send(struct port *port) +{ + struct marker marker; + u16 index; + + // fill the marker PDU with the appropriate values + marker.subtype = 0x02; + marker.version_number = 0x01; + marker.tlv_type = AD_MARKER_INFORMATION_SUBTYPE; + marker.marker_length = 0x16; + // convert requester_port to Big Endian + marker.requester_port = (((port->actor_port_number & 0xFF) << 8) |((u16)(port->actor_port_number & 0xFF00) >> 8)); + marker.requester_system = port->actor_system; + // convert requester_port(u32) to Big Endian + marker.requester_transaction_id = (((++port->transaction_id & 0xFF) << 24) |((port->transaction_id & 0xFF00) << 8) |((port->transaction_id & 0xFF0000) >> 8) |((port->transaction_id & 0xFF000000) >> 24)); + marker.pad = 0; + marker.tlv_type_terminator = 0x00; + marker.terminator_length = 0x00; + for (index=0; index<90; index++) { + marker.reserved_90[index]=0; + } + + // send the marker information + if (ad_marker_send(port, &marker) >= 0) { + 
BOND_PRINT_DBG(("Sent Marker Information on port %d", port->actor_port_number)); + } +} +#endif + +/** + * ad_marker_info_received - handle receive of a Marker information frame + * @marker_info: Marker info received + * @port: the port we're looking at + * + */ +static void ad_marker_info_received(struct marker *marker_info,struct port *port) +{ + struct marker marker; + + // copy the received marker data to the response marker + //marker = *marker_info; + memcpy(&marker, marker_info, sizeof(struct marker)); + // change the marker subtype to marker response + marker.tlv_type=AD_MARKER_RESPONSE_SUBTYPE; + // send the marker response + + if (ad_marker_send(port, &marker) >= 0) { + BOND_PRINT_DBG(("Sent Marker Response on port %d", port->actor_port_number)); + } +} + +/** + * ad_marker_response_received - handle receive of a marker response frame + * @marker: marker PDU received + * @port: the port we're looking at + * + * This function does nothing since we decided not to implement send and handle + * response for marker PDU's, in this stage, but only to respond to marker + * information. 
+ */ +static void ad_marker_response_received(struct marker *marker, struct port *port) +{ + marker=NULL; // just to satisfy the compiler + port=NULL; // just to satisfy the compiler + // DO NOTHING, SINCE WE DECIDED NOT TO IMPLEMENT THIS FEATURE FOR NOW +} + +/** + * ad_initialize_lacpdu - initialize a given lacpdu structure + * @lacpdu: lacpdu structure to initialize + * + */ +static void ad_initialize_lacpdu(struct lacpdu *lacpdu) +{ + u16 index; + + // initialize lacpdu data + lacpdu->subtype = 0x01; + lacpdu->version_number = 0x01; + lacpdu->tlv_type_actor_info = 0x01; + lacpdu->actor_information_length = 0x14; + // lacpdu->actor_system_priority updated on send + // lacpdu->actor_system updated on send + // lacpdu->actor_key updated on send + // lacpdu->actor_port_priority updated on send + // lacpdu->actor_port updated on send + // lacpdu->actor_state updated on send + lacpdu->tlv_type_partner_info = 0x02; + lacpdu->partner_information_length = 0x14; + for (index=0; index<=2; index++) { + lacpdu->reserved_3_1[index]=0; + } + // lacpdu->partner_system_priority updated on send + // lacpdu->partner_system updated on send + // lacpdu->partner_key updated on send + // lacpdu->partner_port_priority updated on send + // lacpdu->partner_port updated on send + // lacpdu->partner_state updated on send + for (index=0; index<=2; index++) { + lacpdu->reserved_3_2[index]=0; + } + lacpdu->tlv_type_collector_info = 0x03; + lacpdu->collector_information_length= 0x10; + lacpdu->collector_max_delay = AD_COLLECTOR_MAX_DELAY; + for (index=0; index<=11; index++) { + lacpdu->reserved_12[index]=0; + } + lacpdu->tlv_type_terminator = 0x00; + lacpdu->terminator_length = 0; + for (index=0; index<=49; index++) { + lacpdu->reserved_50[index]=0; + } +} + +////////////////////////////////////////////////////////////////////////////////////// +// ================= AD exported functions to the main bonding code ================== 
+////////////////////////////////////////////////////////////////////////////////////// + +// Check aggregators status in team every T seconds +#define AD_AGGREGATOR_SELECTION_TIMER 8 + +static u16 aggregator_identifier; + +/** + * bond_3ad_initialize - initialize a bond's 802.3ad parameters and structures + * @bond: bonding struct to work on + * @tick_resolution: tick duration (millisecond resolution) + * + * Can be called only after the mac address of the bond is set. + */ +void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution) +{ + // check that the bond is not initialized yet + if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr), &(bond->device->dev_addr))) { + + aggregator_identifier = 0; + + BOND_AD_INFO(bond).system.sys_priority = 0xFFFF; + BOND_AD_INFO(bond).system.sys_mac_addr = *((struct mac_addr *)bond->device->dev_addr); + + // initialize how many times this module is called in one second(should be about every 100ms) + ad_ticks_per_sec = tick_resolution; + + // initialize the aggregator selection timer(to activate an aggregation selection after initialize) + BOND_AD_INFO(bond).agg_select_timer = (AD_AGGREGATOR_SELECTION_TIMER * ad_ticks_per_sec); + BOND_AD_INFO(bond).agg_select_mode = AD_BANDWIDTH; + } +} + +/** + * bond_3ad_bind_slave - initialize a slave's port + * @slave: slave struct to work on + * + * Returns: 0 on success + * < 0 on error + */ +int bond_3ad_bind_slave(struct slave *slave) +{ + struct bonding *bond = bond_get_bond_by_slave(slave); + struct port *port; + struct aggregator *aggregator; + + if (bond == NULL) { + printk(KERN_CRIT "The slave %s is not attached to its bond\n", slave->dev->name); + return -1; + } + + //check that the slave has not been initialized yet. 
+ if (SLAVE_AD_INFO(slave).port.slave != slave) { + + // port initialization + port = &(SLAVE_AD_INFO(slave).port); + + ad_initialize_port(port); + + port->slave = slave; + port->actor_port_number = SLAVE_AD_INFO(slave).id; + // key is determined according to the link speed, duplex and user key(which is yet not supported) + // ------------------------------------------------------------ + // Port key : | User key | Speed |Duplex| + // ------------------------------------------------------------ + // 16 6 1 0 + port->actor_admin_port_key = 0; // initialize this parameter + port->actor_admin_port_key |= __get_duplex(port); + port->actor_admin_port_key |= (__get_link_speed(port) << 1); + port->actor_oper_port_key = port->actor_admin_port_key; + // if the port is not full duplex, then the port should be not lacp Enabled + if (!(port->actor_oper_port_key & AD_DUPLEX_KEY_BITS)) { + port->sm_vars &= ~AD_PORT_LACP_ENABLED; + } + // actor system is the bond's system + port->actor_system = BOND_AD_INFO(bond).system.sys_mac_addr; + // tx timer(to verify that no more than MAX_TX_IN_SECOND lacpdu's are sent in one second) + port->sm_tx_timer_counter = ad_ticks_per_sec/AD_MAX_TX_IN_SECOND; + port->aggregator = NULL; + port->next_port_in_aggregator = NULL; + + __disable_port(port); + __initialize_port_locks(port); + + + // aggregator initialization + aggregator = &(SLAVE_AD_INFO(slave).aggregator); + + ad_initialize_agg(aggregator); + + aggregator->aggregator_mac_address = *((struct mac_addr *)bond->device->dev_addr); + aggregator->aggregator_identifier = (++aggregator_identifier); + aggregator->slave = slave; + aggregator->is_active = 0; + aggregator->num_of_ports = 0; + } + + return 0; +} + +/** + * bond_3ad_unbind_slave - deinitialize a slave's port + * @slave: slave struct to work on + * + * Search for the aggregator that is related to this port, remove the + * aggregator and assign another aggregator for other port related to it + * (if any), and remove the port. 
+ */ +void bond_3ad_unbind_slave(struct slave *slave) +{ + struct port *port, *prev_port, *temp_port; + struct aggregator *aggregator, *new_aggregator, *temp_aggregator; + int select_new_active_agg = 0; + + // find the aggregator related to this slave + aggregator = &(SLAVE_AD_INFO(slave).aggregator); + + // find the port related to this slave + port = &(SLAVE_AD_INFO(slave).port); + + // if slave is null, the whole port is not initialized + if (!port->slave) { + printk(KERN_WARNING "bonding: Trying to unbind an uninitialized port on %s\n", slave->dev->name); + return; + } + + bond_3ad_link_status_changed(slave, 0); + + // disable the port + ad_disable_collecting_distributing(port); + + // deinitialize port's locks if necessary(os-specific) + __deinitialize_port_locks(port); + + BOND_PRINT_DBG(("Unbinding Link Aggregation Group %d", aggregator->aggregator_identifier)); + // check if this aggregator is occupied + if (aggregator->lag_ports) { + // check if there are other ports related to this aggregator except + // the port related to this slave (this ensures that there is a + // reason to search for a new aggregator, and that we will find one) + if ((aggregator->lag_ports != port) || (aggregator->lag_ports->next_port_in_aggregator)) { + // find new aggregator for the related port(s) + new_aggregator = __get_first_agg(port); + for (; new_aggregator; new_aggregator = __get_next_agg(new_aggregator)) { + // if the new aggregator is empty, or it is connected to our port only + if (!new_aggregator->lag_ports || ((new_aggregator->lag_ports == port) && !new_aggregator->lag_ports->next_port_in_aggregator)) { + break; + } + } + // if new aggregator found, copy the aggregator's parameters + // and connect the related lag_ports to the new aggregator + if ((new_aggregator) && ((!new_aggregator->lag_ports) || ((new_aggregator->lag_ports == port) && !new_aggregator->lag_ports->next_port_in_aggregator))) { + BOND_PRINT_DBG(("Some port(s) related to LAG %d - replacing with LAG %d", 
aggregator->aggregator_identifier, new_aggregator->aggregator_identifier)); + + if ((new_aggregator->lag_ports == port) && new_aggregator->is_active) { + printk(KERN_INFO "bonding: Removing an active aggregator\n"); + // select new active aggregator + select_new_active_agg = 1; + } + + new_aggregator->is_individual = aggregator->is_individual; + new_aggregator->actor_admin_aggregator_key = aggregator->actor_admin_aggregator_key; + new_aggregator->actor_oper_aggregator_key = aggregator->actor_oper_aggregator_key; + new_aggregator->partner_system = aggregator->partner_system; + new_aggregator->partner_system_priority = aggregator->partner_system_priority; + new_aggregator->partner_oper_aggregator_key = aggregator->partner_oper_aggregator_key; + new_aggregator->receive_state = aggregator->receive_state; + new_aggregator->transmit_state = aggregator->transmit_state; + new_aggregator->lag_ports = aggregator->lag_ports; + new_aggregator->is_active = aggregator->is_active; + new_aggregator->num_of_ports = aggregator->num_of_ports; + + // update the information that is written on the ports about the aggregator + for (temp_port=aggregator->lag_ports; temp_port; temp_port=temp_port->next_port_in_aggregator) { + temp_port->aggregator=new_aggregator; + temp_port->actor_port_aggregator_identifier = new_aggregator->aggregator_identifier; + } + + // clear the aggregator + ad_clear_agg(aggregator); + + if (select_new_active_agg) { + ad_agg_selection_logic(__get_first_agg(port)); + } + } else { + printk(KERN_WARNING "bonding: Warning: unbinding aggregator, " + "and could not find a new aggregator for its ports\n"); + } + } else { // in case that the only port related to this aggregator is the one we want to remove + select_new_active_agg = aggregator->is_active; + // clear the aggregator + ad_clear_agg(aggregator); + if (select_new_active_agg) { + printk(KERN_INFO "Removing an active aggregator\n"); + // select new active aggregator + ad_agg_selection_logic(__get_first_agg(port)); 
+ } + } + } + + BOND_PRINT_DBG(("Unbinding port %d", port->actor_port_number)); + // find the aggregator that this port is connected to + temp_aggregator = __get_first_agg(port); + for (; temp_aggregator; temp_aggregator = __get_next_agg(temp_aggregator)) { + prev_port = NULL; + // search the port in the aggregator's related ports + for (temp_port=temp_aggregator->lag_ports; temp_port; prev_port=temp_port, temp_port=temp_port->next_port_in_aggregator) { + if (temp_port == port) { // the aggregator found - detach the port from this aggregator + if (prev_port) { + prev_port->next_port_in_aggregator = temp_port->next_port_in_aggregator; + } else { + temp_aggregator->lag_ports = temp_port->next_port_in_aggregator; + } + temp_aggregator->num_of_ports--; + if (temp_aggregator->num_of_ports==0) { + select_new_active_agg = temp_aggregator->is_active; + // clear the aggregator + ad_clear_agg(temp_aggregator); + if (select_new_active_agg) { + printk(KERN_INFO "Removing an active aggregator\n"); + // select new active aggregator + ad_agg_selection_logic(__get_first_agg(port)); + } + } + break; + } + } + } + port->slave=NULL; +} + +/** + * bond_3ad_state_machine_handler - handle state machines timeout + * @bond: bonding struct to work on + * + * The state machine handling concept in this module is to check every tick + * which state machine should operate any function. The execution order is + * round robin, so when we have an interaction between state machines, the + * reply of one to each other might be delayed until next tick. + * + * This function also complete the initialization when the agg_select_timer + * times out, and it selects an aggregator for the ports that are yet not + * related to any aggregator, and selects the active aggregator for a bond. 
+ */ +void bond_3ad_state_machine_handler(struct bonding *bond) +{ + struct port *port; + struct aggregator *aggregator; + unsigned long flags; + + read_lock_irqsave(&bond->lock, flags); + + //check if there are any slaves + if (bond->next == (struct slave *)bond) { + goto end; + } + + if ((bond->device->flags & IFF_UP) != IFF_UP) { + goto end; + } + + // check if agg_select_timer timer after initialize is timed out + if (BOND_AD_INFO(bond).agg_select_timer && !(--BOND_AD_INFO(bond).agg_select_timer)) { + // select the active aggregator for the bond + if ((port = __get_first_port(bond))) { + if (!port->slave) { + printk(KERN_WARNING "bonding: Warning: bond's first port is uninitialized\n"); + goto end; + } + + aggregator = __get_first_agg(port); + ad_agg_selection_logic(aggregator); + } + } + + // for each port run the state machines + for (port = __get_first_port(bond); port; port = __get_next_port(port)) { + if (!port->slave) { + printk(KERN_WARNING "bonding: Warning: Found an uninitialized port\n"); + goto end; + } + + ad_rx_machine(NULL, port); + ad_periodic_machine(port); + ad_port_selection_logic(port); + ad_mux_machine(port); + ad_tx_machine(port); + + // turn off the BEGIN bit, since we already handled it + if (port->sm_vars & AD_PORT_BEGIN) { + port->sm_vars &= ~AD_PORT_BEGIN; + } + } + +end: + read_unlock_irqrestore(&bond->lock, flags); + + + if ((bond->device->flags & IFF_UP) == IFF_UP) { + /* re-arm the timer */ + mod_timer(&(BOND_AD_INFO(bond).ad_timer), jiffies + (AD_TIMER_INTERVAL * HZ / 1000)); + } +} + +/** + * bond_3ad_rx_indication - handle a received frame + * @lacpdu: received lacpdu + * @slave: slave struct to work on + * @length: length of the data received + * + * It is assumed that frames that were sent on this NIC don't returned as new + * received frames (loopback). Since only the payload is given to this + * function, it check for loopback. 
+ */ +void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 length) +{ + struct port *port; + + if (length >= sizeof(struct lacpdu)) { + + port = &(SLAVE_AD_INFO(slave).port); + + if (!port->slave) { + printk(KERN_WARNING "bonding: Warning: port of slave %s is uninitialized\n", slave->dev->name); + return; + } + + switch (lacpdu->subtype) { + case AD_TYPE_LACPDU: + __ntohs_lacpdu(lacpdu); + BOND_PRINT_DBG(("Received LACPDU on port %d", port->actor_port_number)); + ad_rx_machine(lacpdu, port); + break; + + case AD_TYPE_MARKER: + // No need to convert fields to Little Endian since we don't use the marker's fields. + + switch (((struct marker *)lacpdu)->tlv_type) { + case AD_MARKER_INFORMATION_SUBTYPE: + BOND_PRINT_DBG(("Received Marker Information on port %d", port->actor_port_number)); + ad_marker_info_received((struct marker *)lacpdu, port); + break; + + case AD_MARKER_RESPONSE_SUBTYPE: + BOND_PRINT_DBG(("Received Marker Response on port %d", port->actor_port_number)); + ad_marker_response_received((struct marker *)lacpdu, port); + break; + + default: + BOND_PRINT_DBG(("Received an unknown Marker subtype on slot %d", port->actor_port_number)); + } + } + } +} + +/** + * bond_3ad_adapter_speed_changed - handle a slave's speed change indication + * @slave: slave struct to work on + * + * Handle reselection of aggregator (if needed) for this port. 
+ */ +void bond_3ad_adapter_speed_changed(struct slave *slave) +{ + struct port *port; + + port = &(SLAVE_AD_INFO(slave).port); + + // if slave is null, the whole port is not initialized + if (!port->slave) { + printk(KERN_WARNING "bonding: Warning: speed changed for uninitialized port on %s\n", + slave->dev->name); + return; + } + + port->actor_admin_port_key &= ~AD_SPEED_KEY_BITS; + port->actor_oper_port_key=port->actor_admin_port_key |= (__get_link_speed(port) << 1); + BOND_PRINT_DBG(("Port %d changed speed", port->actor_port_number)); + // there is no need to reselect a new aggregator, just signal the + // state machines to reinitialize + port->sm_vars |= AD_PORT_BEGIN; +} + +/** + * bond_3ad_adapter_duplex_changed - handle a slave's duplex change indication + * @slave: slave struct to work on + * + * Handle reselection of aggregator (if needed) for this port. + */ +void bond_3ad_adapter_duplex_changed(struct slave *slave) +{ + struct port *port; + + port=&(SLAVE_AD_INFO(slave).port); + + // if slave is null, the whole port is not initialized + if (!port->slave) { + printk(KERN_WARNING "bonding: Warning: duplex changed for uninitialized port on %s\n", + slave->dev->name); + return; + } + + port->actor_admin_port_key &= ~AD_DUPLEX_KEY_BITS; + port->actor_oper_port_key=port->actor_admin_port_key |= __get_duplex(port); + BOND_PRINT_DBG(("Port %d changed duplex", port->actor_port_number)); + // there is no need to reselect a new aggregator, just signal the + // state machines to reinitialize + port->sm_vars |= AD_PORT_BEGIN; +} + +/** + * bond_3ad_link_status_changed - handle a slave's link status change indication + * @slave: slave struct to work on + * @status: whether the link is now up or down + * + * Handle reselection of aggregator (if needed) for this port. 
+ */ +void bond_3ad_link_status_changed(struct slave *slave, int status) +{ + struct port *port; + + port = &(SLAVE_AD_INFO(slave).port); + + // if slave is null, the whole port is not initialized + if (!port->slave) { + printk(KERN_WARNING "bonding: Warning: link status changed for uninitialized port on %s\n", + slave->dev->name); + return; + } + + // on link down we are zeroing duplex and speed since some of the adaptors(ce1000.lan) report full duplex/speed instead of N/A(duplex) / 0(speed) + // on link up we are forcing recheck on the duplex and speed since some of the adaptors(ce1000.lan) report + if (status) { // is up + port->is_enabled = 1; + port->actor_admin_port_key &= ~AD_DUPLEX_KEY_BITS; + port->actor_oper_port_key=port->actor_admin_port_key |= __get_duplex(port); + port->actor_admin_port_key &= ~AD_SPEED_KEY_BITS; + port->actor_oper_port_key=port->actor_admin_port_key |= (__get_link_speed(port) << 1); + } else { + port->is_enabled = 0; + port->actor_admin_port_key &= ~AD_DUPLEX_KEY_BITS; + port->actor_oper_port_key= (port->actor_admin_port_key &= ~AD_SPEED_KEY_BITS); + } + BOND_PRINT_DBG(("Port %d changed link status to %s", port->actor_port_number, (status?"UP":"DOWN"))); + // there is no need to reselect a new aggregator, just signal the + // state machines to reinitialize + port->sm_vars |= AD_PORT_BEGIN; +} + +/** + * bond_3ad_get_active_agg_info - get information of the active aggregator + * @bond: bonding struct to work on + * @ad_info: ad_info struct to fill with the bond's info + * + * Returns: 0 on success + * < 0 on error + */ +int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info) +{ + struct aggregator *aggregator = NULL; + struct port *port; + + for (port = __get_first_port(bond); port; port = __get_next_port(port)) { + if (port->aggregator && port->aggregator->is_active) { + aggregator = port->aggregator; + break; + } + } + + if (aggregator) { + ad_info->aggregator_id = aggregator->aggregator_identifier; + 
ad_info->ports = aggregator->num_of_ports; + ad_info->actor_key = aggregator->actor_oper_aggregator_key; + ad_info->partner_key = aggregator->partner_oper_aggregator_key; + memcpy(ad_info->partner_system, aggregator->partner_system.mac_addr_value, ETH_ALEN); + return 0; + } + + return -1; +} + +int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) +{ + slave_t *slave, *start_at; + struct bonding *bond = (struct bonding *) dev->priv; + unsigned long flags; + struct ethhdr *data = (struct ethhdr *)skb->data; + int slave_agg_no; + int slaves_in_agg; + int agg_id; + struct ad_info ad_info; + + if (!IS_UP(dev)) { /* bond down */ + dev_kfree_skb(skb); + return 0; + } + + if (bond == NULL) { + printk(KERN_CRIT "bonding: Error: bond is NULL on device %s\n", dev->name); + dev_kfree_skb(skb); + return 0; + } + + read_lock_irqsave(&bond->lock, flags); + slave = bond->prev; + + /* check if bond is empty */ + if ((slave == (struct slave *) bond) || (bond->slave_cnt == 0)) { + printk(KERN_DEBUG "ERROR: bond is empty\n"); + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + if (bond_3ad_get_active_agg_info(bond, &ad_info)) { + printk(KERN_DEBUG "ERROR: bond_3ad_get_active_agg_info failed\n"); + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + slaves_in_agg = ad_info.ports; + agg_id = ad_info.aggregator_id; + + if (slaves_in_agg == 0) { + /*the aggregator is empty*/ + printk(KERN_DEBUG "ERROR: active aggregator is empty\n"); + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + /* we're at the root, get the first slave */ + if ((slave == NULL) || (slave->dev == NULL)) { + /* no suitable interface, frame not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + slave_agg_no = (data->h_dest[5]^slave->dev->dev_addr[5]) % slaves_in_agg; + while (slave != (slave_t *)bond) { + struct aggregator *agg = 
SLAVE_AD_INFO(slave).port.aggregator; + + if (agg && (agg->aggregator_identifier == agg_id)) { + slave_agg_no--; + if (slave_agg_no < 0) { + break; + } + } + + slave = slave->prev; + if (slave == NULL) { + printk(KERN_ERR "bonding: Error: slave is NULL\n"); + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + } + + if (slave == (slave_t *)bond) { + printk(KERN_ERR "bonding: Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id); + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + start_at = slave; + + do { + int slave_agg_id = 0; + struct aggregator *agg; + + if (slave == NULL) { + printk(KERN_ERR "bonding: Error: slave is NULL\n"); + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + agg = SLAVE_AD_INFO(slave).port.aggregator; + + if (agg) { + slave_agg_id = agg->aggregator_identifier; + } + + if (SLAVE_IS_OK(slave) && + agg && (slave_agg_id == agg_id)) { + skb->dev = slave->dev; + skb->priority = 1; + dev_queue_xmit(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + } while ((slave = slave->next) != start_at); + + /* no suitable interface, frame not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; +} + +int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype) +{ + struct bonding *bond = (struct bonding *)dev->priv; + struct slave *slave = NULL; + unsigned long flags; + int ret = NET_RX_DROP; + + if (!(dev->flags & IFF_MASTER)) { + goto out; + } + + read_lock_irqsave(&bond->lock, flags); +#ifdef BOND_POINT_TO_POINT_PROT + slave = bond_get_slave_by_dev((struct bonding *) dev->priv, skb->real_dev); +#else +#warning "skb->real_dev not defined. apply bond-p2p patch for the module to work !!!" 
+#endif //BOND_POINT_TO_POINT_PROT + + if (slave == NULL) { + goto out_unlock; + } + + bond_3ad_rx_indication((struct lacpdu *) skb->data, slave, skb->len); + + ret = NET_RX_SUCCESS; + +out_unlock: + read_unlock_irqrestore(&bond->lock, flags); +out: + dev_kfree_skb(skb); + + return ret; +} + diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_3ad.h linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_3ad.h --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_3ad.h 1970-01-01 02:00:00.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_3ad.h 2003-03-18 17:24:25.000000000 +0200 @@ -0,0 +1,281 @@ +/**************************************************************************** + Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along with + this program; if not, write to the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston, MA 02111-1307, USA. + + The full GNU General Public License is included in this distribution in the + file called LICENSE. 
+*****************************************************************************/ + +#ifndef __BOND_3AD_H__ +#define __BOND_3AD_H__ + +#include +#include +#include + +// General definitions +#define BOND_ETH_P_LACPDU 0x8809 +#define PKT_TYPE_LACPDU __constant_htons(BOND_ETH_P_LACPDU) +#define AD_TIMER_INTERVAL 100 /*msec*/ + +#define MULTICAST_LACPDU_ADDR {0x01, 0x80, 0xC2, 0x00, 0x00, 0x02} +#define AD_MULTICAST_LACPDU_ADDR {MULTICAST_LACPDU_ADDR} + +typedef struct mac_addr { + u8 mac_addr_value[ETH_ALEN]; +} mac_addr_t; + +typedef enum { + AD_BANDWIDTH = 0, + AD_COUNT +} agg_selection_t; + +// rx machine states(43.4.11 in the 802.3ad standard) +typedef enum { + AD_RX_DUMMY, + AD_RX_INITIALIZE, // rx Machine + AD_RX_PORT_DISABLED, // rx Machine + AD_RX_LACP_DISABLED, // rx Machine + AD_RX_EXPIRED, // rx Machine + AD_RX_DEFAULTED, // rx Machine + AD_RX_CURRENT // rx Machine +} rx_states_t; + +// periodic machine states(43.4.12 in the 802.3ad standard) +typedef enum { + AD_PERIODIC_DUMMY, + AD_NO_PERIODIC, // periodic machine + AD_FAST_PERIODIC, // periodic machine + AD_SLOW_PERIODIC, // periodic machine + AD_PERIODIC_TX // periodic machine +} periodic_states_t; + +// mux machine states(43.4.13 in the 802.3ad standard) +typedef enum { + AD_MUX_DUMMY, + AD_MUX_DETACHED, // mux machine + AD_MUX_WAITING, // mux machine + AD_MUX_ATTACHED, // mux machine + AD_MUX_COLLECTING_DISTRIBUTING // mux machine +} mux_states_t; + +// tx machine states(43.4.15 in the 802.3ad standard) +typedef enum { + AD_TX_DUMMY, + AD_TRANSMIT // tx Machine +} tx_states_t; + +// rx indication types +typedef enum { + AD_TYPE_LACPDU = 1, // type lacpdu + AD_TYPE_MARKER // type marker +} pdu_type_t; + +// rx marker indication types +typedef enum { + AD_MARKER_INFORMATION_SUBTYPE = 1, // marker information subtype + AD_MARKER_RESPONSE_SUBTYPE // marker response subtype +} marker_subtype_t; + +// timers types(43.4.9 in the 802.3ad standard) +typedef enum { + AD_CURRENT_WHILE_TIMER, + 
AD_ACTOR_CHURN_TIMER, + AD_PERIODIC_TIMER, + AD_PARTNER_CHURN_TIMER, + AD_WAIT_WHILE_TIMER +} ad_timers_t; + +#pragma pack(1) + +typedef struct ad_header { + struct mac_addr destination_address; + struct mac_addr source_address; + u16 length_type; +} ad_header_t; + +// Link Aggregation Control Protocol(LACP) data unit structure(43.4.2.2 in the 802.3ad standard) +typedef struct lacpdu { + u8 subtype; // = LACP(= 0x01) + u8 version_number; + u8 tlv_type_actor_info; // = actor information(type/length/value) + u8 actor_information_length; // = 20 + u16 actor_system_priority; + struct mac_addr actor_system; + u16 actor_key; + u16 actor_port_priority; + u16 actor_port; + u8 actor_state; + u8 reserved_3_1[3]; // = 0 + u8 tlv_type_partner_info; // = partner information + u8 partner_information_length; // = 20 + u16 partner_system_priority; + struct mac_addr partner_system; + u16 partner_key; + u16 partner_port_priority; + u16 partner_port; + u8 partner_state; + u8 reserved_3_2[3]; // = 0 + u8 tlv_type_collector_info; // = collector information + u8 collector_information_length; // = 16 + u16 collector_max_delay; + u8 reserved_12[12]; + u8 tlv_type_terminator; // = terminator + u8 terminator_length; // = 0 + u8 reserved_50[50]; // = 0 +} lacpdu_t; + +typedef struct lacpdu_header { + struct ad_header ad_header; + struct lacpdu lacpdu; +} lacpdu_header_t; + +// Marker Protocol Data Unit(PDU) structure(43.5.3.2 in the 802.3ad standard) +typedef struct marker { + u8 subtype; // = 0x02 (marker PDU) + u8 version_number; // = 0x01 + u8 tlv_type; // = 0x01 (marker information) + // = 0x02 (marker response information) + u8 marker_length; // = 0x16 + u16 requester_port; // The number assigned to the port by the requester + struct mac_addr requester_system; // The requester’s system id + u32 requester_transaction_id; // The transaction id allocated by the requester, + u16 pad; // = 0 + u8 tlv_type_terminator; // = 0x00 + u8 terminator_length; // = 0x00 + u8 reserved_90[90]; // = 0 +} 
marker_t; + +typedef struct marker_header { + struct ad_header ad_header; + struct marker marker; +} marker_header_t; + +#pragma pack() + +struct slave; +struct bonding; +struct ad_info; +struct port; + +#ifdef __ia64__ +#pragma pack(8) +#endif + +// aggregator structure(43.4.5 in the 802.3ad standard) +typedef struct aggregator { + struct mac_addr aggregator_mac_address; + u16 aggregator_identifier; + u16 is_individual; // BOOLEAN + u16 actor_admin_aggregator_key; + u16 actor_oper_aggregator_key; + struct mac_addr partner_system; + u16 partner_system_priority; + u16 partner_oper_aggregator_key; + u16 receive_state; // BOOLEAN + u16 transmit_state; // BOOLEAN + struct port *lag_ports; + // ****** PRIVATE PARAMETERS ****** + struct slave *slave; // pointer to the bond slave that this aggregator belongs to + u16 is_active; // BOOLEAN. Indicates if this aggregator is active + u16 num_of_ports; +} aggregator_t; + +// port structure(43.4.6 in the 802.3ad standard) +typedef struct port { + u16 actor_port_number; + u16 actor_port_priority; + struct mac_addr actor_system; // This parameter is added here although it is not specified in the standard, just for simplification + u16 actor_system_priority; // This parameter is added here although it is not specified in the standard, just for simplification + u16 actor_port_aggregator_identifier; + u16 ntt; // BOOLEAN + u16 actor_admin_port_key; + u16 actor_oper_port_key; + u8 actor_admin_port_state; + u8 actor_oper_port_state; + struct mac_addr partner_admin_system; + struct mac_addr partner_oper_system; + u16 partner_admin_system_priority; + u16 partner_oper_system_priority; + u16 partner_admin_key; + u16 partner_oper_key; + u16 partner_admin_port_number; + u16 partner_oper_port_number; + u16 partner_admin_port_priority; + u16 partner_oper_port_priority; + u8 partner_admin_port_state; + u8 partner_oper_port_state; + u16 is_enabled; // BOOLEAN + // ****** PRIVATE PARAMETERS ****** + u16 sm_vars; // all state machines variables 
for this port + rx_states_t sm_rx_state; // state machine rx state + u16 sm_rx_timer_counter; // state machine rx timer counter + periodic_states_t sm_periodic_state;// state machine periodic state + u16 sm_periodic_timer_counter; // state machine periodic timer counter + mux_states_t sm_mux_state; // state machine mux state + u16 sm_mux_timer_counter; // state machine mux timer counter + tx_states_t sm_tx_state; // state machine tx state + u16 sm_tx_timer_counter; // state machine tx timer counter(always on - enters the transmit state 3 times per second) + struct slave *slave; // pointer to the bond slave that this port belongs to + struct aggregator *aggregator; // pointer to the aggregator that this port is related to + struct port *next_port_in_aggregator; // Next port on the linked list of the parent aggregator + u32 transaction_id; // continuous number for identification of Marker PDUs + struct lacpdu lacpdu; // the lacpdu that will be sent for this port +} port_t; + +// system structure +typedef struct ad_system { + u16 sys_priority; + struct mac_addr sys_mac_addr; +} ad_system_t; + +#ifdef __ia64__ +#pragma pack() +#endif + +// ================= AD Exported structures to the main bonding code ================== +#define BOND_AD_INFO(bond) ((bond)->ad_info) +#define SLAVE_AD_INFO(slave) ((slave)->ad_info) + +struct ad_bond_info { + ad_system_t system; // 802.3ad system structure + u32 agg_select_timer; // Timer to select aggregator after all adapter's hand shakes + u32 agg_select_mode; // Mode of selection of active aggregator(bandwidth/count) + struct timer_list ad_timer; + struct packet_type ad_pkt_type; +}; + +struct ad_slave_info { + struct aggregator aggregator; // 802.3ad aggregator structure + struct port port; // 802.3ad port structure + spinlock_t rx_machine_lock; // To avoid race condition between callback and receive interrupt + u16 id; +}; + +// ================= AD Exported functions to the main bonding code ================== +void 
bond_3ad_initialize(struct bonding *bond, u16 tick_resolution); +int bond_3ad_bind_slave(struct slave *slave); +void bond_3ad_unbind_slave(struct slave *slave); +void bond_3ad_state_machine_handler(struct bonding *bond); +void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 length); +void bond_3ad_adapter_speed_changed(struct slave *slave); +void bond_3ad_adapter_duplex_changed(struct slave *slave); +void bond_3ad_link_status_changed(struct slave *slave, int status); +int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info); +int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev); +int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype); +#endif //__BOND_3AD_H__ + diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bonding.h linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bonding.h --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bonding.h 2003-03-18 17:24:24.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bonding.h 2003-03-18 17:24:25.000000000 +0200 @@ -10,6 +10,11 @@ * This software may be used and distributed according to the terms * of the GNU Public License, incorporated herein by reference. * + * + * 2003/03/18 - Amir Noam , + * Tsippy Mendelson and + * Shmulik Hen + * - Added support for IEEE 802.3ad Dynamic link aggregation mode. */ #ifndef _LINUX_BONDING_H @@ -17,6 +22,36 @@ #include #include +#include "bond_3ad.h" + +#ifdef BONDING_DEBUG + +// use this like so: BOND_PRINT_DBG(("foo = %d, bar = %d", foo, bar)); +#define BOND_PRINT_DBG(X) \ +do { \ + printk(KERN_DEBUG "%s (%d)", __FUNCTION__, __LINE__); \ + printk X; \ + printk("\n"); \ +} while(0) + +#else +#define BOND_PRINT_DBG(X) +#endif /* BONDING_DEBUG */ + +#define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \ + (netif_running(dev) && netif_carrier_ok(dev))) + +/* Checks whether the dev is ready for transmit. 
We do not check netif_running */ +/* since a device can be stopped by the driver for short periods of time for */ +/* maintenance. dev_queue_xmit() handles this by queuing the packet until the */ +/* dev is running again. Keeping packets ordering requires sticking to the */ +/* same dev as much as possible */ +#define SLAVE_IS_OK(slave) \ + ((((slave)->dev->flags & (IFF_UP)) == (IFF_UP)) && \ + netif_carrier_ok((slave)->dev) && \ + ((slave)->link == BOND_LINK_UP) && \ + ((slave)->state == BOND_STATE_ACTIVE)) + typedef struct slave { struct slave *next; @@ -31,6 +66,7 @@ typedef struct slave { u16 speed; u8 duplex; u8 perm_hwaddr[ETH_ALEN]; + struct ad_slave_info ad_info; // HUGE struct. maybe alloc dynamically } slave_t; /* @@ -62,7 +98,53 @@ typedef struct bonding { struct net_device *device; struct dev_mc_list *mc_list; unsigned short flags; + struct ad_bond_info ad_info; } bonding_t; +void bond_set_slave_active_flags(slave_t *slave); +void bond_set_slave_inactive_flags(slave_t *slave); + +//this function can be used for iterating the slave list (which is circular) +//must be locked with bond RW lock +extern inline struct slave* +bond_get_next_slave(struct bonding *bond, struct slave *slave) +{ + //If we have reached the last slave - return NULL + if (slave->next == bond->next) { + return NULL; + } + return slave->next; +} + +//must be locked with bond RW lock +//returns NULL if the net_device does not belong to any of the bond's slaves +extern inline struct slave* +bond_get_slave_by_dev(struct bonding *bond, struct net_device *slave_dev) +{ + struct slave *our_slave = bond->next; + + //check if the list of slaves is empty + if (our_slave == (slave_t *)bond) { + return NULL; + } + + for (; our_slave; our_slave = bond_get_next_slave(bond, our_slave)) { + if (our_slave->dev == slave_dev) { + break; + } + } + return our_slave; +} + +extern inline struct bonding* +bond_get_bond_by_slave(struct slave *slave) +{ + if (!slave || !slave->dev->master) { + return NULL; 
+ } + + return (struct bonding *)(slave->dev->master->priv); +} + #endif /* _LINUX_BONDING_H */ diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c 2003-03-18 17:24:24.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c 2003-03-18 17:24:25.000000000 +0200 @@ -319,6 +319,11 @@ * properly. * - Block possibility of enslaving before the master is up. This * prevents putting the system in an unstable state. + * + * 2003/03/18 - Amir Noam , + * Tsippy Mendelson and + * Shmulik Hen + * - Added support for IEEE 802.3ad Dynamic link aggregatoin mode. */ #include @@ -359,6 +364,7 @@ #include #include #include "bonding.h" +#include "bond_3ad.h" #define DRV_VERSION "2.4.20-20030317" #define DRV_RELDATE "March 17, 2003" @@ -409,6 +415,7 @@ static struct bond_parm_tbl bond_mode_tb { "active-backup", BOND_MODE_ACTIVEBACKUP}, { "balance-xor", BOND_MODE_XOR}, { "broadcast", BOND_MODE_BROADCAST}, +{ "802.3ad", BOND_MODE_8023AD}, { NULL, -1}, }; @@ -464,8 +471,6 @@ static void bond_set_promiscuity(bonding static void bond_set_allmulti(bonding_t *bond, int inc); static struct dev_mc_list* bond_mc_list_find_dmi(struct dev_mc_list *dmi, struct dev_mc_list *mc_list); static void bond_mc_update(bonding_t *bond, slave_t *new, slave_t *old); -static void bond_set_slave_inactive_flags(slave_t *slave); -static void bond_set_slave_active_flags(slave_t *slave); static int bond_enslave(struct net_device *master, struct net_device *slave); static int bond_release(struct net_device *master, struct net_device *slave); static int bond_release_all(struct net_device *master); @@ -483,9 +488,6 @@ static int bond_get_info(char *buf, char /* several macros */ -#define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \ - (netif_running(dev) && netif_carrier_ok(dev))) - static void arp_send_all(slave_t *slave) { 
int i; @@ -510,6 +512,8 @@ bond_mode_name(void) return "load balancing (xor)"; case BOND_MODE_BROADCAST : return "fault-tolerance (broadcast)"; + case BOND_MODE_8023AD: + return "IEEE 802.3ad Dynamic link aggregation"; default : return "unknown"; } @@ -530,13 +534,13 @@ multicast_mode_name(void) } } -static void bond_set_slave_inactive_flags(slave_t *slave) +void bond_set_slave_inactive_flags(slave_t *slave) { slave->state = BOND_STATE_BACKUP; slave->dev->flags |= IFF_NOARP; } -static void bond_set_slave_active_flags(slave_t *slave) +void bond_set_slave_active_flags(slave_t *slave) { slave->state = BOND_STATE_ACTIVE; slave->dev->flags &= ~IFF_NOARP; @@ -815,8 +819,29 @@ static u16 bond_check_mii_link(bonding_t return (has_active_interface ? BMSR_LSTATUS : 0); } +//register to receive lacpdus on a bond +static void bond_register_lacpdu(struct bonding *bond) +{ + struct packet_type* pk_type = &(BOND_AD_INFO(bond).ad_pkt_type); + + //initialize packet type + pk_type->type = PKT_TYPE_LACPDU; + pk_type->dev = bond->device; + pk_type->func = bond_3ad_lacpdu_recv; + pk_type->data = (void*)1; // understand shared skbs + + dev_add_pack(pk_type); +} + +//register to receive lacpdus on a bond +static void bond_unregister_lacpdu(struct bonding *bond) +{ + dev_remove_pack(&(BOND_AD_INFO(bond).ad_pkt_type)); +} + static int bond_open(struct net_device *dev) { + struct bonding *bond = (struct bonding *)(dev->priv); struct timer_list *timer = &((struct bonding *)(dev->priv))->mii_timer; struct timer_list *arp_timer = &((struct bonding *)(dev->priv))->arp_timer; MOD_INC_USE_COUNT; @@ -840,6 +865,19 @@ static int bond_open(struct net_device * } add_timer(arp_timer); } + + if (bond_mode == BOND_MODE_8023AD) { + struct timer_list *ad_timer = &(BOND_AD_INFO(bond).ad_timer); + init_timer(ad_timer); + ad_timer->expires = jiffies + (AD_TIMER_INTERVAL * HZ / 1000); + ad_timer->data = (unsigned long)bond; + ad_timer->function = (void *)&bond_3ad_state_machine_handler; + add_timer(ad_timer); 
+ + //register to receive LACPDUs + bond_register_lacpdu(bond); + } + return 0; } @@ -861,8 +899,18 @@ static int bond_close(struct net_device } } - /* Release the bonded slaves */ - bond_release_all(master); + if (bond_mode == BOND_MODE_8023AD) { + del_timer_sync(&(BOND_AD_INFO(bond).ad_timer)); + + //Unregister the receive of LACPDUs + bond_unregister_lacpdu(bond); + } + + if (bond->next != (struct slave *) bond) { + /* Release the bonded slaves */ + bond_release_all(master); + } + bond_mc_list_destroy (bond); write_unlock_irqrestore(&bond->lock, flags); @@ -880,6 +928,13 @@ static void bond_mc_list_flush(struct ne for (dmi = flush->mc_list; dmi != NULL; dmi = dmi->next) dev_mc_delete(dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); + + if (bond_mode == BOND_MODE_8023AD) { + /*del lacpdu mc addr to mc list*/ + u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR; + + dev_mc_delete(dev, lacpdu_multicast, ETH_ALEN, 0); + } } /* @@ -1240,6 +1295,13 @@ static int bond_enslave(struct net_devic dev_mc_add (slave_dev, dmi->dmi_addr, dmi->dmi_addrlen, 0); } + if (bond_mode == BOND_MODE_8023AD) { + /*add lacpdu mc addr to mc list*/ + u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR; + + dev_mc_add(slave_dev, lacpdu_multicast, ETH_ALEN, 0); + } + write_lock_irqsave(&bond->lock, flags); bond_attach_slave(bond, new_slave); @@ -1299,6 +1361,11 @@ static int bond_enslave(struct net_devic "bond_enslave(): failed to get speed/duplex from %s, " "speed forced to 100Mbps, duplex forced to Full.\n", new_slave->dev->name); + if (bond_mode == BOND_MODE_8023AD) { + printk(KERN_WARNING + "Operation of 802.3ad mode requires ETHTOOL support " + "in base driver for proper aggregator selection.\n"); + } } /* if we're in active-backup mode, we need one and only one active @@ -1337,6 +1404,23 @@ static int bond_enslave(struct net_devic if (primary != NULL) if( strcmp(primary, new_slave->dev->name) == 0) bond->primary_slave = new_slave; + } else if (bond_mode == BOND_MODE_8023AD) { + /* in 
802.3ad mode, the internal mechanism + will activate the slaves in the selected + aggregator */ + bond_set_slave_inactive_flags(new_slave); + //if this is the first slave + if (new_slave == bond->next) { + SLAVE_AD_INFO(new_slave).id = 1; + /*Initialize AD with the number of times that the AD timer is called in 1 second*/ + /*can be called only after the mac address of the bond is set*/ + bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL); + } else { + SLAVE_AD_INFO(new_slave).id = + SLAVE_AD_INFO(new_slave->prev).id + 1; + } + + bond_3ad_bind_slave(new_slave); } else { #ifdef BONDING_DEBUG printk(KERN_CRIT "This slave is always active in trunk mode\n"); @@ -1601,6 +1685,12 @@ static int bond_release(struct net_devic old_current = bond->current_slave; while ((our_slave = our_slave->prev) != (slave_t *)bond) { if (our_slave->dev == slave) { + /* Inform AD package of unbinding of slave. */ + if (bond_mode == BOND_MODE_8023AD) { + bond_3ad_unbind_slave(our_slave); + } + + /* release the slave from its bond */ bond_detach_slave(bond, our_slave); printk (KERN_INFO "%s: releasing %s interface %s", @@ -1705,6 +1795,12 @@ static int bond_release_all(struct net_d bond->primary_slave = NULL; while ((our_slave = bond->prev) != (slave_t *)bond) { + /* Inform AD package of unbinding of slave + before slave is detached from the list. 
*/ + if (bond_mode == BOND_MODE_8023AD) { + bond_3ad_unbind_slave(our_slave); + } + slave_dev = our_slave->dev; bond_detach_slave(bond, our_slave); @@ -1784,6 +1880,8 @@ static void bond_mii_monitor(struct net_ int mindelay = updelay + 1; struct net_device *dev = slave->dev; int link_state; + u16 old_speed = slave->speed; + u8 old_duplex = slave->duplex; link_state = bond_check_dev_link(dev, 0); @@ -1832,7 +1930,7 @@ static void bond_mii_monitor(struct net_ slave->link = BOND_LINK_DOWN; /* in active/backup mode, we must completely disable this interface */ - if (bond_mode == BOND_MODE_ACTIVEBACKUP) { + if ((bond_mode == BOND_MODE_ACTIVEBACKUP) || (bond_mode == BOND_MODE_8023AD)) { bond_set_slave_inactive_flags(slave); } printk(KERN_INFO @@ -1841,6 +1939,11 @@ static void bond_mii_monitor(struct net_ master->name, dev->name); + //notify ad that the link status has changed + if (bond_mode == BOND_MODE_8023AD) { + bond_3ad_link_status_changed(slave, 0); + } + read_lock(&bond->ptrlock); if (slave == bond->current_slave) { read_unlock(&bond->ptrlock); @@ -1911,8 +2014,12 @@ static void bond_mii_monitor(struct net_ /* now the link has been up for long time enough */ slave->link = BOND_LINK_UP; slave->jiffies = jiffies; - - if (bond_mode != BOND_MODE_ACTIVEBACKUP) { + + if (bond_mode == BOND_MODE_8023AD) { + /* prevent it from being the active one */ + slave->state = BOND_STATE_BACKUP; + } + else if (bond_mode != BOND_MODE_ACTIVEBACKUP) { /* make it immediately active */ slave->state = BOND_STATE_ACTIVE; } else if (slave != bond->primary_slave) { @@ -1926,7 +2033,12 @@ static void bond_mii_monitor(struct net_ master->name, dev->name); - if ( (bond->primary_slave != NULL) + //notify ad that the link status has changed + if (bond_mode == BOND_MODE_8023AD) { + bond_3ad_link_status_changed(slave, 1); + } + + if ( (bond->primary_slave != NULL) && (slave == bond->primary_slave) ) change_active_interface(bond); } @@ -1950,7 +2062,16 @@ static void bond_mii_monitor(struct net_ } 
/* end of switch */ bond_update_speed_duplex(slave); - + + if (bond_mode == BOND_MODE_8023AD) { + if (old_speed != slave->speed) { + bond_3ad_adapter_speed_changed(slave); + } + if (old_duplex != slave->duplex) { + bond_3ad_adapter_duplex_changed(slave); + } + } + } /* end of while */ /* @@ -1978,12 +2099,17 @@ static void bond_mii_monitor(struct net_ bestslave->delay = 0; bestslave->link = BOND_LINK_UP; bestslave->jiffies = jiffies; + + //notify ad that the link status has changed + if (bond_mode == BOND_MODE_8023AD) { + bond_3ad_link_status_changed(bestslave, 1); + } } if (bond_mode == BOND_MODE_ACTIVEBACKUP) { bond_set_slave_active_flags(bestslave); bond_mc_update(bond, bestslave, NULL); - } else { + } else if (bond_mode != BOND_MODE_8023AD) { bestslave->state = BOND_STATE_ACTIVE; } write_lock(&bond->ptrlock); @@ -2956,6 +3082,31 @@ static int bond_get_info(char *buf, char multicast_mode_name()); read_lock_irqsave(&bond->lock, flags); + + if (bond_mode == BOND_MODE_8023AD) { + struct ad_info ad_info; + + len += sprintf(buf + len, "\n802.3ad info\n"); + + if (bond_3ad_get_active_agg_info(bond, &ad_info)) { + len += sprintf(buf + len, "bond %s has no active aggregator\n", bond->device->name); + } else { + len += sprintf(buf + len, "Active Aggregator Info:\n"); + + len += sprintf(buf + len, "\tAggregator ID: %d\n", ad_info.aggregator_id); + len += sprintf(buf + len, "\tNumber of ports: %d\n", ad_info.ports); + len += sprintf(buf + len, "\tActor Key: %d\n", ad_info.actor_key); + len += sprintf(buf + len, "\tPartner Key: %d\n", ad_info.partner_key); + len += sprintf(buf + len, "\tPartner Mac Address: %02x:%02x:%02x:%02x:%02x:%02x\n", + ad_info.partner_system[0], + ad_info.partner_system[1], + ad_info.partner_system[2], + ad_info.partner_system[3], + ad_info.partner_system[4], + ad_info.partner_system[5]); + } + } + for (slave = bond->prev; slave != (slave_t *)bond; slave = slave->prev) { len += sprintf(buf + len, "\nSlave Interface: %s\n", slave->dev->name); @@ 
-2976,6 +3127,17 @@ static int bond_get_info(char *buf, char slave->perm_hwaddr[3], slave->perm_hwaddr[4], slave->perm_hwaddr[5]); + + if (bond_mode == BOND_MODE_8023AD) { + struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator; + + if (agg) { + len += sprintf(buf + len, "Aggregator ID: %d\n", + agg->aggregator_identifier); + } else { + len += sprintf(buf + len, "Aggregator ID: N/A\n"); + } + } } read_unlock_irqrestore(&bond->lock, flags); @@ -3093,6 +3255,9 @@ static int __init bond_init(struct net_d case BOND_MODE_BROADCAST: dev->hard_start_xmit = bond_xmit_broadcast; break; + case BOND_MODE_8023AD: + dev->hard_start_xmit = bond_3ad_xmit_xor; + break; default: printk(KERN_ERR "Unknown bonding mode %d\n", bond_mode); kfree(bond->stats); @@ -3280,6 +3445,35 @@ static int __init bonding_init(void) downdelay = 0; } + /* reset values for 802.3ad */ + if (bond_mode == BOND_MODE_8023AD) { + if (arp_interval != 0) { + printk(KERN_WARNING "bonding_init(): ARP monitoring" + "can't be used simultaneously with 802.3ad, " + "disabling ARP monitoring\n" + ); + arp_interval = 0; + } + + if (miimon == 0) { + printk(KERN_ERR + "bonding_init(): miimon must be specified, " + "otherwise bonding will not detect link failure, " + "speed and duplex which are essential " + "for 802.3ad operation" + "Forcing miimon to 100msec\n"); + miimon = 100; + } + + if (multicast_mode != BOND_MULTICAST_ALL) { + printk(KERN_ERR + "bonding_init(): Multicast mode must " + "be set to ALL for 802.3ad, " + "Forcing Multicast mode to ALL\n"); + multicast_mode = BOND_MULTICAST_ALL; + } + } + if (miimon == 0) { if ((updelay != 0) || (downdelay != 0)) { /* just warn the user the up/down delay will have diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/Makefile linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/Makefile --- linux-2.4.20-bonding-20030317/drivers/net/bonding/Makefile 2003-03-18 17:24:24.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/Makefile 
2003-03-18 17:24:25.000000000 +0200 @@ -4,7 +4,8 @@ O_TARGET := bonding.o -obj-y := bond_main.o +obj-y := bond_main.o \ + bond_3ad.o obj-m := $(O_TARGET) diff -Nuarp linux-2.4.20-bonding-20030317/include/linux/if_bonding.h linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h --- linux-2.4.20-bonding-20030317/include/linux/if_bonding.h 2003-03-18 17:24:24.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h 2003-03-18 17:24:25.000000000 +0200 @@ -23,6 +23,11 @@ * 2003/03/18 - Tsippy Mendelson and * Amir Noam * - Moved driver's private data types to bonding.h + * + * 2003/03/18 - Amir Noam , + * Tsippy Mendelson and + * Shmulik Hen + * - Added support for IEEE 802.3ad Dynamic link aggregatoin mode. */ #ifndef _LINUX_IF_BONDING_H @@ -49,6 +54,7 @@ #define BOND_MODE_ACTIVEBACKUP 1 #define BOND_MODE_XOR 2 #define BOND_MODE_BROADCAST 3 +#define BOND_MODE_8023AD 4 /* each slave's link has 4 states */ #define BOND_LINK_UP 0 /* link is up and running */ @@ -81,6 +87,14 @@ typedef struct ifslave __u32 link_failure_count; } ifslave; +struct ad_info { + __u16 aggregator_id; + __u16 ports; + __u16 actor_key; + __u16 partner_key; + __u8 partner_system[ETH_ALEN]; +}; + #endif /* _LINUX_IF_BONDING_H */ /* -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From kuznet@ms2.inr.ac.ru Thu Mar 20 07:50:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:50:16 -0800 (PST) Received: from sex.inr.ac.ru (sex.inr.ac.ru [193.233.7.165]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFo8q9028733 for ; Thu, 20 Mar 2003 07:50:10 -0800 Received: (from kuznet@localhost) by sex.inr.ac.ru (8.6.13/ANK) id SAA10972; Thu, 20 Mar 2003 18:49:49 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <200303201549.SAA10972@sex.inr.ac.ru> Subject: Re: TCP/IPv6 broken in Linux 2.5.64? 
To: ahu@ds9a.NL (bert hubert) Date: Thu, 20 Mar 2003 18:49:49 +0300 (MSK) Cc: netdev@oss.sgi.com In-Reply-To: <20030318162532.GA9705@outpost.ds9a.nl> from "bert hubert" at Mar 18, 3 07:45:02 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1997 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Content-Length: 756 Lines: 22 Hello! > irc servers, or an IPv6 zonetransfer. However, when I try to ssh from 2.5.65 Try this. I have just found this lost patch, it is from 2.4 tree, but it should fit to 2.5 as well. Alexey ===== net/ipv6/tcp_ipv6.c 1.19 vs edited ===== --- 1.19/net/ipv6/tcp_ipv6.c Thu Jan 23 21:14:18 2003 +++ edited/net/ipv6/tcp_ipv6.c Thu Mar 20 18:44:17 2003 @@ -983,7 +983,7 @@ struct ipv6_pinfo *np = &sk->net_pinfo.af_inet6; if (skb->ip_summed == CHECKSUM_HW) { - th->check = csum_ipv6_magic(&np->saddr, &np->daddr, len, IPPROTO_TCP, 0); + th->check = ~csum_ipv6_magic(&np->saddr, &np->daddr, len, IPPROTO_TCP, 0); skb->csum = offsetof(struct tcphdr, check); } else { th->check = csum_ipv6_magic(&np->saddr, &np->daddr, len, IPPROTO_TCP, From ahu@outpost.ds9a.nl Thu Mar 20 07:55:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 07:55:43 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KFsfq9029102 for ; Thu, 20 Mar 2003 07:55:22 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 21F8A45A0; Thu, 20 Mar 2003 16:19:12 +0100 (CET) Date: Thu, 20 Mar 2003 16:19:12 +0100 From: bert hubert To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Cc: davem@redhat.com Subject: [some more data] Re: [BUG] 2.5.65 ipv6 TCP checksum errors (capture attached) Message-ID: <20030320151912.GA25487@outpost.ds9a.nl> Mail-Followup-To: bert hubert , 
netdev@oss.sgi.com, linux-kernel@vger.kernel.org, davem@redhat.com References: <20030319124533.GA14363@outpost.ds9a.nl> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="NzB8fVQJ5HfG6fxh" Content-Disposition: inline In-Reply-To: <20030319124533.GA14363@outpost.ds9a.nl> User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1998 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 3833 Lines: 69 --NzB8fVQJ5HfG6fxh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Wed, Mar 19, 2003 at 01:45:33PM +0100, bert hubert wrote: > Interestingly, the initial ssh connection worked, the second one failed. > Subsequent attempts fail too. I've since let loose the excellent ethereal on this and found out: > hubert# tcpdump -r file -v -v > 29.09 snapcount.33408 > hubert.ssh: S [tcp sum ok] 2737328594:2737328594(0) win 5760 (len 40, hlim 64) > 29.09 hubert.ssh > snapcount.33408: S [tcp sum ok] 2399386333:2399386333(0) ack 2737328595 win 5712 (len 40, hlim 64) > 29.09 snapcount.33408 > hubert.ssh: . [tcp sum ok] 1:1(0) ack 1 win 5760 (len 32, hlim 64) So far so good. > 29.10 hubert.ssh > snapcount.33408: P [bad tcp cksum 4f2!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) > 29.30 hubert.ssh > snapcount.33408: P [bad tcp cksum 3bf1!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) > 29.83 hubert.ssh > snapcount.33408: P [bad tcp cksum 23ef!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) > 30.86 hubert.ssh > snapcount.33408: P [bad tcp cksum 23eb!] 1:41(40) ack 1 win 5712 (len 72, hlim 64) These packets all have an identical csum of 0x680d, it is not being updated. So, the SYN/SYNACK/ACK stuff went fine, the initial data however has a wrong checksum. For completeness, I've attached the capture again. 
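A note on the checksum behaviour above: a checksum field that stays at 0x680d across retransmits would be consistent with the field only ever holding a seed derived from the pseudo-header. The sketch below is a userspace illustration, not kernel code. It assumes a device that sums the whole TCP segment, including whatever is already in the checksum field, and stores the complement of that sum. Under that assumption the field has to be seeded with the folded but uncomplemented pseudo-header sum. As far as I know csum_ipv6_magic() returns the complemented value, which would explain why a CHECKSUM_HW path needs an extra ~ in front of it. All function names here (sum_words, fold, pseudo_sum, software_check, offloaded_check, seed_is_correct) and the 2001:db8:: addresses are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define NEXTHDR_TCP   6   /* IPv6 next-header value for TCP */
#define TCP_CHECK_OFF 16  /* offset of the checksum field in the TCP header */

/* Add a buffer to a running sum as big-endian 16-bit words. */
static uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    return sum;
}

/* Fold the carries back in: ones' complement addition in 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Folded, NOT complemented, sum of the IPv6 pseudo-header. */
static uint16_t pseudo_sum(const uint8_t src[16], const uint8_t dst[16],
                           uint32_t len, uint8_t next_hdr)
{
    uint32_t sum = 0;

    sum = sum_words(src, 16, sum);
    sum = sum_words(dst, 16, sum);
    sum += (len >> 16) + (len & 0xffff);
    sum += next_hdr;
    return fold(sum);
}

static void put_check(uint8_t *seg, uint16_t v)
{
    seg[TCP_CHECK_OFF] = (uint8_t)(v >> 8);
    seg[TCP_CHECK_OFF + 1] = (uint8_t)(v & 0xff);
}

/* Reference result: checksum field zeroed, everything summed in software. */
static uint16_t software_check(const uint8_t src[16], const uint8_t dst[16],
                               uint8_t *seg, size_t len)
{
    put_check(seg, 0);
    return (uint16_t)~fold(sum_words(seg, len,
                           pseudo_sum(src, dst, (uint32_t)len, NEXTHDR_TCP)));
}

/* Assumed device behaviour: sum the segment as it stands, seed included. */
static uint16_t offloaded_check(uint8_t *seg, size_t len, uint16_t seed)
{
    put_check(seg, seed);
    return (uint16_t)~fold(sum_words(seg, len, 0));
}

/* Returns 1 if the assumed device reproduces the software checksum. */
static int seed_is_correct(int complement_seed)
{
    /* Made-up endpoints from the 2001:db8::/32 documentation prefix. */
    static const uint8_t src[16] = { 0x20, 0x01, 0x0d, 0xb8, [15] = 0x01 };
    static const uint8_t dst[16] = { 0x20, 0x01, 0x0d, 0xb8, [15] = 0x02 };
    uint8_t seg[24] = {
        0x00, 0x16, 0x82, 0x80,   /* source port 22, destination port 33408 */
        0x00, 0x00, 0x00, 0x01,   /* sequence number */
        0x00, 0x00, 0x00, 0x01,   /* acknowledgment number */
        0x50, 0x18, 0x16, 0x50,   /* data offset 5, PSH|ACK, window 5712 */
        0x00, 0x00, 0x00, 0x00,   /* checksum (set below), urgent pointer */
        'S', 'S', 'H', '-'        /* four bytes of payload */
    };
    uint16_t seed = pseudo_sum(src, dst, (uint32_t)sizeof(seg), NEXTHDR_TCP);
    uint16_t want = software_check(src, dst, seg, sizeof(seg));

    if (complement_seed)
        seed = (uint16_t)~seed;   /* the complemented value as seed */
    return offloaded_check(seg, sizeof(seg), seed) == want;
}
```

With these inputs seed_is_correct(0) returns 1 and seed_is_correct(1) returns 0. The two seeds only give the same result in the corner case where the pseudo-header sum folds to 0xffff.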
Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting --NzB8fVQJ5HfG6fxh Content-Type: application/octet-stream Content-Disposition: attachment; filename=bad-csum Content-Transfer-Encoding: base64 1MOyoQIABAAAAAAAAAAAANwFAAABAAAAiWR4Pht3AQBeAAAAXgAAAAAIoRnw8ACgzMjyXIbd YAAAAAAoBkAgAQiIEDYAAAIIof/+GfDxIAEIiBA2AAACCKH//hnw8IKAABajKFHSAAAAAKAC FoACJAAAAgQFoAQCCAoAox+fAAAAAAEDAwCJZHg+wHcBAF4AAABeAAAAAKDMyPJcAAihGfDw ht1gAAAAACgGQCABCIgQNgAAAgih//4Z8PAgAQiIEDYAAAIIof/+GfDxABaCgI8Dut2jKFHT oBIWUOuhAAACBAWgBAIICgClzBoAox+fAQMDAIlkeD5KeAEAVgAAAFYAAAAACKEZ8PAAoMzI 8lyG3WAAAAAAIAZAIAEIiBA2AAACCKH//hnw8SABCIgQNgAAAgih//4Z8PCCgAAWoyhR048D ut6AEBaAGiIAAAEBCAoAox+gAKXMGolkeD5XigEAfgAAAH4AAAAAoMzI8lwACKEZ8PCG3WAA AAAASAZAIAEIiBA2AAACCKH//hnw8CABCIgQNgAAAgih//4Z8PEAFoKAjwO63qMoUdOAGBZQ aA0AAAEBCAoApcwfAKMfoFNTSC0xLjk5LU9wZW5TU0hfMy41cDEgRGViaWFuIDE6My41cDEt NQqJZHg+7pkEAH4AAAB+AAAAAKDMyPJcAAihGfDwht1gAAAAAEgGQCABCIgQNgAAAgih//4Z 8PAgAQiIEDYAAAIIof/+GfDxABaCgI8Dut6jKFHTgBgWUGgNAAABAQgKAKXM6ACjH6BTU0gt MS45OS1PcGVuU1NIXzMuNXAxIERlYmlhbiAxOjMuNXAxLTUKiWR4Pv/GDAB+AAAAfgAAAACg zMjyXAAIoRnw8IbdYAAAAABIBkAgAQiIEDYAAAIIof/+GfDwIAEIiBA2AAACCKH//hnw8QAW goCPA7reoyhR04AYFlBoDQAAAQEICgClzwAAox+gU1NILTEuOTktT3BlblNTSF8zLjVwMSBE ZWJpYW4gMTozLjVwMS01CopkeD4pJA0AfgAAAH4AAAAAoMzI8lwACKEZ8PCG3WAAAAAASAZA IAEIiBA2AAACCKH//hnw8CABCIgQNgAAAgih//4Z8PEAFoKAjwO63qMoUdOAGBZQaA0AAAEB CAoApdMAAKMfoFNTSC0xLjk5LU9wZW5TU0hfMy41cDEgRGViaWFuIDE6My41cDEtNQqMZHg+ 0/QJAH4AAAB+AAAAAKDMyPJcAAihGfDwht1gAAAAAEgGQCABCIgQNgAAAgih//4Z8PAgAQiI EDYAAAIIof/+GfDxABaCgI8Dut6jKFHTgBgWUGgNAAABAQgKAKXaAACjH6BTU0gtMS45OS1P cGVuU1NIXzMuNXAxIERlYmlhbiAxOjMuNXAxLTUKj2R4PkXxDgB+AAAAfgAAAACgzMjyXAAI oRnw8IbdYAAAAABIBkAgAQiIEDYAAAIIof/+GfDwIAEIiBA2AAACCKH//hnw8QAWgoCPA7re oyhR04AYFlBoDQAAAQEICgCl5wAAox+gU1NILTEuOTktT3BlblNTSF8zLjVwMSBEZWJpYW4g 
MTozLjVwMS01CpZkeD6mqAkAfgAAAH4AAAAAoMzI8lwACKEZ8PCG3WAAAAAASAZAIAEIiBA2 AAACCKH//hnw8CABCIgQNgAAAgih//4Z8PEAFoKAjwO63qMoUdOAGBZQaA0AAAEBCAoApgEA AKMfoFNTSC0xLjk5LU9wZW5TU0hfMy41cDEgRGViaWFuIDE6My41cDEtNQo= --NzB8fVQJ5HfG6fxh-- From garzik@gtf.org Thu Mar 20 08:57:06 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 08:57:12 -0800 (PST) Received: from havoc.gtf.org (havoc.daloft.com [64.213.145.173]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KGuPq9002397 for ; Thu, 20 Mar 2003 08:57:05 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id A6DD96654; Thu, 20 Mar 2003 16:56:18 +0000 (US/Central) Date: Thu, 20 Mar 2003 11:56:18 -0500 From: Jeff Garzik To: Shmulik Hen Cc: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list Subject: Re: [patch] (0/8) Adding 802.3ad support to bonding Message-ID: <20030320165618.GB8256@gtf.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 1999 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 300 Lines: 15 I (and many others) will be going over these patches. I also see that somebody (davem?) applied your divide-by-zero patch to the mainline kernel. My initial comment is that we will want to work to eliminate these ifdefs. Other comments will follow. Thanks to Intel for these efforts! 
Jeff From ahu@outpost.ds9a.nl Thu Mar 20 09:01:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 09:01:35 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KH1Lq9003191 for ; Thu, 20 Mar 2003 09:01:22 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id DEC313FDD; Thu, 20 Mar 2003 17:31:07 +0100 (CET) Date: Thu, 20 Mar 2003 17:31:07 +0100 From: bert hubert To: kuznet@ms2.inr.ac.ru Cc: netdev@oss.sgi.com Subject: Re: TCP/IPv6 broken in Linux 2.5.64? Message-ID: <20030320163107.GA28229@outpost.ds9a.nl> Mail-Followup-To: bert hubert , kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com References: <20030318162532.GA9705@outpost.ds9a.nl> <200303201549.SAA10972@sex.inr.ac.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200303201549.SAA10972@sex.inr.ac.ru> User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2000 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 687 Lines: 21 On Thu, Mar 20, 2003 at 06:49:49PM +0300, kuznet@ms2.inr.ac.ru wrote: > Hello! > > > irc servers, or an IPv6 zonetransfer. However, when I try to ssh from 2.5.65 > > Try this. I have just found this lost patch, it is from 2.4 tree, but > it should fit to 2.5 as well. This has solved my problem 100%. The earlier mentioned hostA and hostB can now cheerfully connect to each other. Everything else I try works too. So I suggest this be sent linus-wards. Thanks!
Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From hshmulik@intel.com Thu Mar 20 09:57:59 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 09:58:06 -0800 (PST) Received: from caduceus.fm.intel.com (fmr02.intel.com [192.55.52.25]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KHvwq9003999 for ; Thu, 20 Mar 2003 09:57:59 -0800 Received: from talaria.fm.intel.com (talaria.fm.intel.com [10.1.192.39]) by caduceus.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2KHpH621838 for ; Thu, 20 Mar 2003 17:51:21 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by talaria.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2KFEpc05915 for ; Thu, 20 Mar 2003 15:14:51 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032007133815638 ; Thu, 20 Mar 2003 07:13:40 -0800 Date: Thu, 20 Mar 2003 17:13:23 +0200 (IST) From: Shmulik Hen X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com To: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik Subject: [Bonding][patch set] - Adding IEEE 802.3ad Dynamic link aggregation support Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2001 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hshmulik@intel.com Precedence: bulk X-list: netdev Content-Length: 1228 Lines: 31 Hello, The following set of 7(+2) patches add support for 802.3ad link aggregation mode on top of the latest release of bonding from source-forge (2.4.20-20030317). 
They also include fixes for a set of bugs discovered during the past several weeks of extensive testing by our QA group. This comes as one of several enhancements Intel has decided to contribute to the open source community. This code is ported from our iANS product, which has been around for some time. We are in the process of porting our advanced networking features from iANS to the bonding driver. In future releases we plan to add more features and improvements and to adapt the code for 2.5.x kernels. The first 2 patches add support for point-to-point protocols to the 2.4.20/2.4.21-pre5 kernels in the net subtree, and are a prerequisite for the 802.3ad feature. The following patches only modify the bonding files. -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | | | | Anti-Spam: shmulik dot hen at intel dot com | From erik@hensema.net Thu Mar 20 10:10:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 10:10:54 -0800 (PST) Received: from dexter.hensema.net (cc78409-a.hnglo1.ov.home.nl [212.120.97.185]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KIA7q9004459 for ; Thu, 20 Mar 2003 10:10:50 -0800 Received: from bender.home.hensema.net (bender.ipv6.hensema.net [IPv6:2001:888:10a1:0:202:44ff:fe69:60f5]) by dexter.hensema.net (8.12.3/8.12.3) with ESMTP id h2KIA5PA009275; Thu, 20 Mar 2003 19:10:05 +0100 Received: from bender.home.hensema.net (localhost [127.0.0.1]) by bender.home.hensema.net (8.12.3/8.12.3) with ESMTP id h2KIA5dN019976; Thu, 20 Mar 2003 19:10:05 +0100 Received: (from erik@localhost) by bender.home.hensema.net (8.12.3/8.12.3/Submit) id h2KIA50J019975; Thu, 20 Mar 2003 19:10:05 +0100 Date: Thu, 20 Mar 2003 19:10:05 +0100 From: Erik Hensema To: bert hubert Cc: netdev@oss.sgi.com Subject: Re: TCP/IPv6 broken in Linux 2.5.64?
Message-ID: <20030320181004.GA19970@hensema.net> Reply-To: erik@hensema.net References: <20030318162532.GA9705@outpost.ds9a.nl> <200303201549.SAA10972@sex.inr.ac.ru> <20030320163107.GA28229@outpost.ds9a.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030320163107.GA28229@outpost.ds9a.nl> User-Agent: Mutt/1.3.27i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2002 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: erik@hensema.net Precedence: bulk X-list: netdev Content-Length: 644 Lines: 18 On Thu, Mar 20, 2003 at 05:31:07PM +0100, bert hubert wrote: > On Thu, Mar 20, 2003 at 06:49:49PM +0300, kuznet@ms2.inr.ac.ru wrote: > > Hello! > > > > > irc servers, or an IPv6 zonetransfer. However, when I try to ssh from 2.5.65 > > > > Try this. I have just found this lost patch, it is from 2.4 tree, but > > it should fit to 2.5 as well. > > This has solved my problem 100%. The earlier mentioned hostA and hostB can > now cheerfully connect to each other. Everything else I try works too. > > So I suggest this be sent linus-wards. Thanks! I can confirm that this fixed my IPv6 problems on 2.5.x.
-- Erik Hensema (erik@hensema.net) From joern@wohnheim.fh-wedel.de Thu Mar 20 14:33:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 14:33:45 -0800 (PST) Received: from wohnheim.fh-wedel.de (wohnheim.fh-wedel.de [195.37.86.122]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KMXdq9023743 for ; Thu, 20 Mar 2003 14:33:40 -0800 Received: from joern by wohnheim.fh-wedel.de with local (Exim 3.35 #1 (Debian)) id 18w8bJ-0005iA-00; Thu, 20 Mar 2003 23:33:29 +0100 Date: Thu, 20 Mar 2003 23:33:29 +0100 From: =?iso-8859-1?Q?J=F6rn?= Engel To: linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com, acme@conectiva.com.br Subject: [PATCH] clean up net/802/Makefile (small version) Message-ID: <20030320223329.GB13641@wohnheim.fh-wedel.de> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2003 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: joern@wohnheim.fh-wedel.de Precedence: bulk X-list: netdev Content-Length: 653 Lines: 27 This patch simply removes a couple of lines with duplicated functionality. Patch is against 2.4.20. Arnaldo, are you the correct maintainer for this? Jörn -- Victory in war is not repetitious. 
-- Sun Tzu --- linux-2.4.20/net/802/Makefile Sat Aug 3 02:39:46 2002 +++ linux-2.4.20/net/802/Makefile.1 Thu Mar 20 23:20:05 2003 @@ -15,13 +15,9 @@ obj-$(CONFIG_SYSCTL) += sysctl_net_802.o obj-$(CONFIG_LLC) += llc_sendpdu.o llc_utility.o cl2llc.o llc_macinit.o -ifeq ($(CONFIG_SYSCTL),y) -obj-y += sysctl_net_802.o -endif ifeq ($(CONFIG_LLC),y) subdir-y += transit -obj-y += llc_sendpdu.o llc_utility.o cl2llc.o llc_macinit.o SNAP = y endif From joern@wohnheim.fh-wedel.de Thu Mar 20 14:35:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 14:35:51 -0800 (PST) Received: from wohnheim.fh-wedel.de (wohnheim.fh-wedel.de [195.37.86.122]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KMZlq9023908 for ; Thu, 20 Mar 2003 14:35:48 -0800 Received: from joern by wohnheim.fh-wedel.de with local (Exim 3.35 #1 (Debian)) id 18w8dS-0007EF-00; Thu, 20 Mar 2003 23:35:42 +0100 Date: Thu, 20 Mar 2003 23:35:42 +0100 From: =?iso-8859-1?Q?J=F6rn?= Engel To: linux-kernel@vger.kernel.org Cc: netdev@oss.sgi.com, acme@conectiva.com.br Subject: Re: [PATCH] clean up net/802/Makefile (large version) Message-ID: <20030320223542.GC13641@wohnheim.fh-wedel.de> References: <20030320223329.GB13641@wohnheim.fh-wedel.de> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20030320223329.GB13641@wohnheim.fh-wedel.de> User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2004 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: joern@wohnheim.fh-wedel.de Precedence: bulk X-list: netdev Content-Length: 1420 Lines: 77 This one tries to clean up the other code as well. Jörn -- The cheapest, fastest and most reliable components of a computer system are those that aren't there. 
-- Gordon Bell, DEC laboratories --- linux-2.4.20/net/802/Makefile Sat Aug 3 02:39:46 2002 +++ linux-2.4.20/net/802/Makefile.2 Thu Mar 20 23:26:22 2003 @@ -11,48 +11,26 @@ export-objs = llc_macinit.o p8022.o psnap.o +snap-objs = p8022.o psnap.o + obj-y = p8023.o obj-$(CONFIG_SYSCTL) += sysctl_net_802.o -obj-$(CONFIG_LLC) += llc_sendpdu.o llc_utility.o cl2llc.o llc_macinit.o -ifeq ($(CONFIG_SYSCTL),y) -obj-y += sysctl_net_802.o -endif - -ifeq ($(CONFIG_LLC),y) -subdir-y += transit -obj-y += llc_sendpdu.o llc_utility.o cl2llc.o llc_macinit.o -SNAP = y -endif - -ifdef CONFIG_TR -obj-y += tr.o - SNAP=y -endif - -ifdef CONFIG_NET_FC -obj-y += fc.o -endif - -ifdef CONFIG_FDDI -obj-y += fddi.o -endif - -ifdef CONFIG_HIPPI -obj-y += hippi.o -endif - -ifdef CONFIG_IPX - SNAP=y -endif - -ifdef CONFIG_ATALK - SNAP=y -endif - -ifeq ($(SNAP),y) -obj-y += p8022.o psnap.o -endif +obj-$(CONFIG_LLC) += llc_sendpdu.o llc_utility.o cl2llc.o llc_macinit.o $(snap-objs) + +subdir-$(CONFIG_LLC) += transit + +obj-$(CONFIG_TR) += tr.o $(snap-objs) + +obj-$(CONFIG_NET_FC) += fc.o + +obj-$(CONFIG_FDDI) += fddi.o + +obj-$(CONFIG_HIPPI) += hippi.o + +obj-$(CONFIG_IPX) += $(snap-objs) + +obj-$(CONFIG_ATALK) += $(snap-objs) include $(TOPDIR)/Rules.make From fubar@us.ibm.com Thu Mar 20 14:54:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 14:54:46 -0800 (PST) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2KMrsq9024540 for ; Thu, 20 Mar 2003 14:54:41 -0800 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e34.co.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2KMrY64075902; Thu, 20 Mar 2003 17:53:34 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay02.boulder.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2KMrYEc084888; Thu, 20 Mar 2003 15:53:34 -0700 Importance: Normal Sensitivity: Subject: Re: [Bonding-devel] [patch] (2/8) Add
802.3ad support to bonding (released to bonding on sourceforge) To: Shmulik Hen Cc: Bonding Developement list , Bonding Announce list , Linux Net Mailing list , Linux Kernel Mailing list , Oss SGI Netdev list , Jeff Garzik X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: From: Jay Vosburgh Date: Thu, 20 Mar 2003 14:53:14 -0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0 [IBM]|December 16, 2002) at 03/20/2003 15:53:34 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2005 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 315 Lines: 14 I have incorporated Shmulik Hen's bug fix patches to bonding (patch numbers 2 and 3) into the current code and released the new patch to sourceforge.net/projects/bonding. The current bonding update is bonding-2.4.20-20030320. The only changes I made were minor spelling / formatting fixes. 
-J From davem@redhat.com Thu Mar 20 16:11:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 16:11:31 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2L0Agq9001959 for ; Thu, 20 Mar 2003 16:11:23 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id QAA14618; Thu, 20 Mar 2003 16:08:46 -0800 Date: Thu, 20 Mar 2003 16:08:45 -0800 (PST) Message-Id: <20030320.160845.121240938.davem@redhat.com> To: fubar@us.ibm.com Cc: hshmulik@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: [Bonding-devel] [patch] (2/8) Add 802.3ad support to bonding (released to bonding on sourceforge) From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2006 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 830 Lines: 19 From: Jay Vosburgh Date: Thu, 20 Mar 2003 14:53:14 -0800 I have incorporated Shmulik Hen's bug fix patches to bonding (patch numbers 2 and 3) into the current code and released the new patch to sourceforge.net/projects/bonding. The current bonding update is bonding-2.4.20-20030320. The only changes I made were minor spelling / formatting fixes. So when do these changes end up being sent to myself or Jeff for mainline inclusion? 
I have no objection to the sourceforge project for bonding, but I do object to there being such latency between what the sourceforge tree has (especially bug fixes) and what gets submitted into the mainline. Personally, I'd prefer that all development occur in the mainline tree. That gives you testing coverage that is impossible otherwise. From toml@us.ibm.com Thu Mar 20 16:40:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 16:40:56 -0800 (PST) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2L0dWq9003027 for ; Thu, 20 Mar 2003 16:40:22 -0800 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e3.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2L0cskD137186; Thu, 20 Mar 2003 19:38:54 -0500 Received: from tomlt2.austin.ibm.com (tomlt2.austin.ibm.com [9.41.94.20]) by northrelay01.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2L0cprD116838; Thu, 20 Mar 2003 19:38:51 -0500 Subject: [PATCH] IPSec: IPV6_IPSEC_POLICY / IPV6_XFRM_POLICY socket options From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 20 Mar 2003 18:40:16 -0600 Message-Id: <1048207217.1212.3.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2007 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 3165 Lines: 114 I've created a patch to fix the problem of racoon not being able to listen on IPv6 addresses. The problem occurs from not having support for the IP(V6)_IPSEC_POLICY and IP(V6)_XFRM_POLICY socket options in IPv6. Please review the patch below and let me know if my fix is ok. 
Additionally, for those wanting to run racoon you will have to update the sockmisc.c file. You will need to change the #define of IPV6_IPSEC_POLICY to use the value 34 and not 16 (which is the IP_IPSEC_POLICY value). This will allow racoon to listen on an IPv6 address, but I'm still not having luck getting racoon working over IPv6. Thanks, Tom diff -ur linux-2.5.65-orig/include/linux/in6.h linux-2.5.65/include/linux/in6.h --- linux-2.5.65-orig/include/linux/in6.h 2003-03-17 15:44:11.000000000 -0600 +++ linux-2.5.65/include/linux/in6.h 2003-03-20 10:51:33.000000000 -0600 @@ -176,5 +176,8 @@ #define IPV6_FLOWLABEL_MGR 32 #define IPV6_FLOWINFO_SEND 33 +#define IPV6_IPSEC_POLICY 34 +#define IPV6_XFRM_POLICY 35 + #endif diff -ur linux-2.5.65-orig/net/ipv4/xfrm_user.c linux-2.5.65/net/ipv4/xfrm_user.c --- linux-2.5.65-orig/net/ipv4/xfrm_user.c 2003-03-17 15:44:08.000000000 -0600 +++ linux-2.5.65/net/ipv4/xfrm_user.c 2003-03-20 09:24:53.000000000 -0600 @@ -1080,10 +1080,26 @@ struct xfrm_policy *xp; int nr; - if (opt != IP_XFRM_POLICY) { - *dir = -EOPNOTSUPP; + switch (family) { + case AF_INET: + if (opt != IP_XFRM_POLICY) { + *dir = -EOPNOTSUPP; + return NULL; + } + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + if (opt != IPV6_XFRM_POLICY) { + *dir = -EOPNOTSUPP; + return NULL; + } + break; +#endif + default: + *dir = -EINVAL; return NULL; } + *dir = -EINVAL; if (len < sizeof(*p) || diff -ur linux-2.5.65-orig/net/ipv6/ipv6_sockglue.c linux-2.5.65/net/ipv6/ipv6_sockglue.c --- linux-2.5.65-orig/net/ipv6/ipv6_sockglue.c 2003-03-17 15:43:39.000000000 -0600 +++ linux-2.5.65/net/ipv6/ipv6_sockglue.c 2003-03-20 10:07:46.000000000 -0600 @@ -47,6 +47,7 @@ #include #include #include +#include #include @@ -386,6 +387,10 @@ case IPV6_FLOWLABEL_MGR: retv = ipv6_flowlabel_opt(sk, optval, optlen); break; + case IPV6_IPSEC_POLICY: + case IPV6_XFRM_POLICY: + retv = xfrm_user_policy(sk, optname, optval, optlen); + break; #ifdef CONFIG_NETFILTER 
default: diff -ur linux-2.5.65-orig/net/key/af_key.c linux-2.5.65/net/key/af_key.c --- linux-2.5.65-orig/net/key/af_key.c 2003-03-17 15:43:49.000000000 -0600 +++ linux-2.5.65/net/key/af_key.c 2003-03-20 16:25:10.000000000 -0600 @@ -2415,8 +2415,23 @@ struct xfrm_policy *xp; struct sadb_x_policy *pol = (struct sadb_x_policy*)data; - if (opt != IP_IPSEC_POLICY) { - *dir = -EOPNOTSUPP; + switch (family) { + case AF_INET: + if (opt != IP_IPSEC_POLICY) { + *dir = -EOPNOTSUPP; + return NULL; + } + break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case AF_INET6: + if (opt != IPV6_IPSEC_POLICY) { + *dir = -EOPNOTSUPP; + return NULL; + } + break; +#endif + default: + *dir = -EINVAL; return NULL; } From fubar@us.ibm.com Thu Mar 20 16:44:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 16:44:22 -0800 (PST) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2L0iHq9003374 for ; Thu, 20 Mar 2003 16:44:18 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e35.co.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2L0i5gJ021158; Thu, 20 Mar 2003 19:44:05 -0500 Received: from d03nm121.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2L0i3sf090166; Thu, 20 Mar 2003 17:44:04 -0700 Importance: Normal Sensitivity: Subject: Re: [Bonding-devel] [patch] (2/8) Add 802.3ad support to bonding (released to bonding on sourceforge) To: "David S. 
Miller" Cc: hshmulik@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com X-Mailer: Lotus Notes Release 5.0.7 March 21, 2001 Message-ID: From: Jay Vosburgh Date: Thu, 20 Mar 2003 16:43:52 -0800 X-MIMETrack: Serialize by Router on D03NM121/03/M/IBM(Release 6.0 [IBM]|December 16, 2002) at 03/20/2003 17:44:04 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2008 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: fubar@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 969 Lines: 31 >So when do these changes end up being sent to myself or >Jeff for mainline inclusion? > >I have no objection to the sourceforge project for bonding, but >I do object to there being such latency between what the sourceforge >tree has (especially bug fixes) and what gets submitted into the >mainline. > >Personally, I'd prefer that all development occur in the mainline >tree. That gives you testing coverage that is impossible otherwise. Fair enough; the delay has gotten excessive of late. Would it be satisfactory going forward for the sourceforge site to contain patches to "standard" releases (e.g., 2.4.20), and do updates to the current development kernel and the sourceforge site simultaneously? In other words, sourceforge has a patch containing all bonding updates since 2.4.20 (or whichever version) was released, and each time that patch is updated, the incremental update goes out for inclusion in the development kernel. 
-J From jgarzik@pobox.com Thu Mar 20 16:56:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 16:56:47 -0800 (PST) Received: from www.linux.org.uk (IDENT:exim@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2L0uhq9003760 for ; Thu, 20 Mar 2003 16:56:44 -0800 Received: from rdu57-8-131.nc.rr.com ([66.57.8.131] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 18wAps-0004Ds-Mg; Fri, 21 Mar 2003 00:56:40 +0000 Message-ID: <3E7A635C.3090000@pobox.com> Date: Thu, 20 Mar 2003 19:57:00 -0500 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Jay Vosburgh CC: "David S. Miller" , hshmulik@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [Bonding-devel] [patch] (2/8) Add 802.3ad support to bonding (released to bonding on sourceforge) References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2009 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 1206 Lines: 32 Jay Vosburgh wrote: > Fair enough; the delay has gotten excessive of late. > > Would it be satisfactory going forward for the sourceforge site to > contain patches to "standard" releases (e.g., 2.4.20), and do updates to > the current development kernel and the sourceforge site simultaneously? 
In > other words, sourceforge has a patch containing all bonding updates since > 2.4.20 (or whichever version) was released, and each time that patch is > updated, the incremental update goes out for inclusion in the development > kernel. The ideal situation is for you to send two sets of patches, one for 2.4 tree and one for 2.5 tree. Those will get applied to 2.4.21-pre and 2.5.. Patches against 2.4.20 proper are ok as long as they apply correctly to the latest 2.4.21-pre tree (so, patches against 2.4.21-pre are preferred) If the patches are the same for 2.4 and 2.5, just send one set and note that fact. My preference would be to address these patches To: davem@redhat.com CC: netdev@oss.sgi.com, jgarzik@pobox.com (David, feel free to correct me here, or direct patches to me) When you receive bug fixes, forwarding ASAP would be very much appreciated. Jeff From davem@redhat.com Thu Mar 20 21:52:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 20 Mar 2003 21:53:03 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2L5qBq9016175 for ; Thu, 20 Mar 2003 21:52:51 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA15131; Thu, 20 Mar 2003 21:50:27 -0800 Date: Thu, 20 Mar 2003 21:50:27 -0800 (PST) Message-Id: <20030320.215027.95912890.davem@redhat.com> To: toml@us.ibm.com Cc: netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH] IPSec: IPV6_IPSEC_POLICY / IPV6_XFRM_POLICY socket options From: "David S. Miller" In-Reply-To: <1048207217.1212.3.camel@tomlt2.tomloffice.austin.ibm.com> References: <1048207217.1212.3.camel@tomlt2.tomloffice.austin.ibm.com> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2010 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 418 Lines: 11 From: Tom Lendacky Date: 20 Mar 2003 18:40:16 -0600 I've created a patch to fix the problem of racoon not being able to listen on IPv6 addresses. The problem occurs from not having support for the IP(V6)_IPSEC_POLICY and IP(V6)_XFRM_POLICY socket options in IPv6. Please review the patch below and let me know if my fix is ok. This looks fine, I will apply it. From larslan@merete.zapto.org Fri Mar 21 00:05:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 00:06:01 -0800 (PST) Received: from merete.balder.no (197.80-202-160.nextgentel.com [80.202.160.197]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2L85Fq9019963 for ; Fri, 21 Mar 2003 00:05:56 -0800 Received: from localhost (larslan@localhost) by merete.balder.no (8.11.6/8.11.6) with ESMTP id h2L7vsI22887; Fri, 21 Mar 2003 08:57:54 +0100 Date: Fri, 21 Mar 2003 08:57:54 +0100 (CET) From: Lars Landmark X-X-Sender: larslan@merete.balder.no To: lartc@mailman.ds9a.nl cc: netdev@oss.sgi.com, Subject: Class crash (linux) Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2011 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: larslan@merete.zapto.org Precedence: bulk X-list: netdev Content-Length: 824 Lines: 26 Hi; I am working with a project intended for developing an adaptive class based QoS system for Linux. I have been struggling with a bug for a long time now.
The bug has been tracked to the function tc_ctl_class() in the file linux/net/sched/sch_api.c. This function contains a switch statement that tests whether RTM_NEWCLASS or one of the other message types is set. If RTM_NEWCLASS is set, an if test is performed: if (n->nlmsg_flags&NLM_F_EXCL) What does this if test do, and in particular what does the NLM_F_EXCL flag represent? When I attach a new class, this test evaluates to true. The next step, after a goto, is to test whether there is a class pointer. If there is a class pointer, the call "cops->put(q, cl)" is made, and my system crashes in this call. Is there any way to fix the oops? Lars Landmark Student From jgarzik@pobox.com Fri Mar 21 06:45:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 06:45:53 -0800 (PST) Received: from www.linux.org.uk (IDENT:exim@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2LEjkq9018748 for ; Fri, 21 Mar 2003 06:45:47 -0800 Received: from rdu57-8-131.nc.rr.com ([66.57.8.131] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 18wNmB-0006x2-Fo for netdev@oss.sgi.com; Fri, 21 Mar 2003 14:45:43 +0000 Message-ID: <3E7B25AC.5030604@pobox.com> Date: Fri, 21 Mar 2003 09:46:04 -0500 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: [Fwd: [E1000] NAPI re-insertion w/ changes] Content-Type: multipart/mixed; boundary="------------010801030101000207060008" X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2012 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 7283 Lines: 217 This is a multi-part message in MIME format.
--------------010801030101000207060008 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit For review by the list... The other e100/e1000 changes are in the 2.5 nightly snapshot on kernel.org... --------------010801030101000207060008 Content-Type: message/rfc822; name="[E1000] NAPI re-insertion w/ changes" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="[E1000] NAPI re-insertion w/ changes" Return-Path: Delivered-To: garzik@gtf.org Received: from kumquat.pobox.com (kumquat.pobox.com [64.119.218.68]) by havoc.gtf.org (Postfix) with ESMTP id CDD6A6646 for ; Fri, 21 Mar 2003 07:35:11 +0000 (US/Central) Received: from kumquat.pobox.com (localhost.localdomain [127.0.0.1]) by kumquat.pobox.com (Postfix) with ESMTP id E46D759E7A for ; Fri, 21 Mar 2003 02:35:10 -0500 (EST) Delivered-To: jgarzik@pobox.com Received: from vger.kernel.org (vger.kernel.org [209.116.70.75]) by kumquat.pobox.com (Postfix) with ESMTP id AD87C59E7F for ; Fri, 21 Mar 2003 02:35:09 -0500 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Fri, 21 Mar 2003 02:24:04 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Fri, 21 Mar 2003 02:24:04 -0500 Received: from hera.kernel.org ([63.209.29.2]:11673 "EHLO hera.kernel.org") by vger.kernel.org with ESMTP id ; Fri, 21 Mar 2003 02:02:51 -0500 Received: (from dwmw2@localhost) by hera.kernel.org (8.11.6/8.11.6) id h2L7Dn802239 for bk-commits-head@vger.kernel.org; Thu, 20 Mar 2003 23:13:49 -0800 Message-Id: <200303210713.h2L7Dn802239@hera.kernel.org> Subject: [E1000] NAPI re-insertion w/ changes Date: Fri, 21 Mar 2003 04:46:10 +0000 From: Linux Kernel Mailing List To: bk-commits-head@vger.kernel.org X-BK-Repository: hera.kernel.org:/home/dwmw2/BK/linus-2.5 X-BK-ChangeSetKey: cramerj@intel.com|ChangeSet|20030321044610|02292 Sender: bk-commits-head-owner@vger.kernel.org Precedence: bulk X-Mailing-List: bk-commits-head@vger.kernel.org X-Spam-Status: No, 
hits=0.0 required=5.0 tests=PATCH_UNIFIED_DIFF,SPAM_PHRASE_00_01,TO_BE_REMOVED_REPLY, X_MAILING_LIST version=2.43 X-Spam-Level: ChangeSet 1.1101.22.17, 2003/03/20 23:46:10-05:00, cramerj@intel.com [E1000] NAPI re-insertion w/ changes * Previous patch wiped NAPI support, adding it back here. But, with a twist: this one doesn't disable/enable interrupts each time we enter/leave polling. (It's EXPERIMENTAL). # This patch includes the following deltas: # ChangeSet 1.1101.22.16 -> 1.1101.22.17 # drivers/net/e1000/e1000_main.c 1.60 -> 1.61 # e1000_main.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 69 insertions(+), 1 deletion(-) diff -Nru a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c --- a/drivers/net/e1000/e1000_main.c Thu Mar 20 23:13:52 2003 +++ b/drivers/net/e1000/e1000_main.c Thu Mar 20 23:13:52 2003 @@ -155,8 +155,14 @@ static inline void e1000_irq_disable(struct e1000_adapter *adapter); static inline void e1000_irq_enable(struct e1000_adapter *adapter); static void e1000_intr(int irq, void *data, struct pt_regs *regs); -static boolean_t e1000_clean_tx_irq(struct e1000_adapter *adapter); +#ifdef CONFIG_E1000_NAPI +static int e1000_clean(struct net_device *netdev, int *budget); +static boolean_t e1000_clean_rx_irq(struct e1000_adapter *adapter, + int *work_done, int work_to_do); +#else static boolean_t e1000_clean_rx_irq(struct e1000_adapter *adapter); +#endif +static boolean_t e1000_clean_tx_irq(struct e1000_adapter *adapter); static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter); static int e1000_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd); static int e1000_mii_ioctl(struct net_device *netdev, struct ifreq *ifr, @@ -418,6 +424,10 @@ netdev->do_ioctl = &e1000_ioctl; netdev->tx_timeout = &e1000_tx_timeout; netdev->watchdog_timeo = 5 * HZ; +#ifdef CONFIG_E1000_NAPI + netdev->poll = &e1000_clean; + netdev->weight = 64; +#endif netdev->vlan_rx_register = e1000_vlan_rx_register; 
netdev->vlan_rx_add_vid = e1000_vlan_rx_add_vid; netdev->vlan_rx_kill_vid = e1000_vlan_rx_kill_vid; @@ -1977,7 +1987,9 @@ struct net_device *netdev = data; struct e1000_adapter *adapter = netdev->priv; uint32_t icr = E1000_READ_REG(&adapter->hw, ICR); +#ifndef CONFIG_E1000_NAPI int i; +#endif if(!icr) return; /* Not our interrupt */ @@ -1987,12 +1999,46 @@ mod_timer(&adapter->watchdog_timer, jiffies); } +#ifdef CONFIG_E1000_NAPI + /* Don't disable interrupts - rely on h/w interrupt + * moderation to keep interrupts low. netif_rx_schedule + * is NOP if already polling. */ + netif_rx_schedule(netdev); +#else for(i = 0; i < E1000_MAX_INTR; i++) if(!e1000_clean_rx_irq(adapter) && !e1000_clean_tx_irq(adapter)) break; +#endif +} +#ifdef CONFIG_E1000_NAPI +/** + * e1000_clean - NAPI Rx polling callback + * @adapter: board private structure + **/ + +static int +e1000_clean(struct net_device *netdev, int *budget) +{ + struct e1000_adapter *adapter = netdev->priv; + int work_to_do = min(*budget, netdev->quota); + int work_done = 0; + + while(work_done < work_to_do) + if(!e1000_clean_rx_irq(adapter, &work_done, work_to_do) && + !e1000_clean_tx_irq(adapter)) + break; + + *budget -= work_done; + netdev->quota -= work_done; + + if(work_done < work_to_do) + netif_rx_complete(netdev); + + return (work_done >= work_to_do); } +#endif /** * e1000_clean_tx_irq - Reclaim resources after transmit completes @@ -2054,7 +2100,12 @@ **/ static boolean_t +#ifdef CONFIG_E1000_NAPI +e1000_clean_rx_irq(struct e1000_adapter *adapter, int *work_done, + int work_to_do) +#else e1000_clean_rx_irq(struct e1000_adapter *adapter) +#endif { struct e1000_desc_ring *rx_ring = &adapter->rx_ring; struct net_device *netdev = adapter->netdev; @@ -2071,6 +2122,13 @@ while(rx_desc->status & E1000_RXD_STAT_DD) { +#ifdef CONFIG_E1000_NAPI + if(*work_done >= work_to_do) + break; + + (*work_done)++; +#endif + cleaned = TRUE; pci_unmap_single(pdev, @@ -2133,12 +2191,22 @@ e1000_rx_checksum(adapter, rx_desc, skb); 
skb->protocol = eth_type_trans(skb, netdev); +#ifdef CONFIG_E1000_NAPI + if(adapter->vlgrp && (rx_desc->status & E1000_RXD_STAT_VP)) { + vlan_hwaccel_receive_skb(skb, adapter->vlgrp, + (rx_desc->special & E1000_RXD_SPC_VLAN_MASK)); + } else { + netif_receive_skb(skb); + } +#else /* CONFIG_E1000_NAPI */ if(adapter->vlgrp && (rx_desc->status & E1000_RXD_STAT_VP)) { vlan_hwaccel_rx(skb, adapter->vlgrp, (rx_desc->special & E1000_RXD_SPC_VLAN_MASK)); } else { netif_rx(skb); } +#endif /* CONFIG_E1000_NAPI */ + netdev->last_rx = jiffies; rx_desc->status = 0; - To unsubscribe from this list: send the line "unsubscribe bk-commits-head" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --------------010801030101000207060008-- From toml@us.ibm.com Fri Mar 21 06:59:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 06:59:26 -0800 (PST) Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2LEwfq9019168 for ; Fri, 21 Mar 2003 06:59:23 -0800 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e1.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2LEw1W8035102; Fri, 21 Mar 2003 09:58:01 -0500 Received: from tomlt2.austin.ibm.com (tomlt2.austin.ibm.com [9.41.94.20]) by northrelay01.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2LEvx76073454; Fri, 21 Mar 2003 09:57:59 -0500 Subject: [PATCH] IPSec: IPv6 source address not set correctly in xfrm_state From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 21 Mar 2003 08:59:23 -0600 Message-Id: <1048258764.1214.28.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2013 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1755 Lines: 41 Here is a patch that fixes the source address in an xfrm_state structure. I found this when the incorrect address was supplied in a pfkey ACQUIRE message. Also, I wasn't able to test the xfrm6_find_acq path, but I believe the memcpy's in that function were not correct and fixed those also. Thanks, Tom diff -ur linux-2.5.65-orig/net/ipv4/xfrm_state.c linux-2.5.65/net/ipv4/xfrm_state.c --- linux-2.5.65-orig/net/ipv4/xfrm_state.c 2003-03-17 15:44:21.000000000 -0600 +++ linux-2.5.65/net/ipv4/xfrm_state.c 2003-03-21 08:48:58.000000000 -0600 @@ -404,7 +404,7 @@ memcpy(&x->id.daddr, daddr, sizeof(x->sel.daddr)); memcpy(&x->props.saddr, &tmpl->saddr, sizeof(x->props.saddr)); if (ipv6_addr_any((struct in6_addr*)&x->props.saddr)) - memcpy(&x->props.saddr, &saddr, sizeof(x->sel.saddr)); + memcpy(&x->props.saddr, saddr, sizeof(x->props.saddr)); x->props.mode = tmpl->mode; x->props.reqid = tmpl->reqid; x->props.family = AF_INET6; @@ -642,13 +642,13 @@ if (x0) { atomic_inc(&x0->refcnt); } else if (create && (x0 = xfrm_state_alloc()) != NULL) { - memcpy(x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); - memcpy(x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); + memcpy(&x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); + memcpy(&x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); x0->sel.prefixlen_d = 128; x0->sel.prefixlen_s = 128; - memcpy(x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); + memcpy(&x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); x0->km.state = XFRM_STATE_ACQ; - memcpy(x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); + memcpy(&x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); x0->id.proto = proto; x0->props.family = AF_INET6; x0->props.mode = mode; From yoshfuji@wide.ad.jp Fri Mar 21 07:27:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 07:27:15 -0800 (PST) Received: 
from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2LFQZq9020303 for ; Fri, 21 Mar 2003 07:27:12 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2LFQODG012541; Sat, 22 Mar 2003 00:26:25 +0900 Date: Sat, 22 Mar 2003 00:26:24 +0900 (JST) Message-Id: <20030322.002624.32606907.yoshfuji@wide.ad.jp> To: toml@us.ibm.com Cc: netdev@oss.sgi.com, davem@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH] IPSec: IPv6 source address not set correctly in xfrm_state From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <1048258764.1214.28.camel@tomlt2.tomloffice.austin.ibm.com> References: <1048258764.1214.28.camel@tomlt2.tomloffice.austin.ibm.com> X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2014 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@wide.ad.jp Precedence: bulk X-list: netdev Content-Length: 1422 Lines: 34 Hello! In article <1048258764.1214.28.camel@tomlt2.tomloffice.austin.ibm.com> (at 21 Mar 2003 08:59:23 -0600), Tom Lendacky says: > memcpy(&x->id.daddr, daddr, sizeof(x->sel.daddr)); > memcpy(&x->props.saddr, &tmpl->saddr, sizeof(x->props.saddr)); > if (ipv6_addr_any((struct in6_addr*)&x->props.saddr)) > - memcpy(&x->props.saddr, &saddr, sizeof(x->sel.saddr)); > + memcpy(&x->props.saddr, saddr, sizeof(x->props.saddr)); this fix is ok. 
however > atomic_inc(&x0->refcnt); > } else if (create && (x0 = xfrm_state_alloc()) != NULL) { > - memcpy(x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); > - memcpy(x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); > + memcpy(&x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); > + memcpy(&x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); > x0->sel.prefixlen_d = 128; > x0->sel.prefixlen_s = 128; > - memcpy(x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); > + memcpy(&x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); > x0->km.state = XFRM_STATE_ACQ; > - memcpy(x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); > + memcpy(&x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); > x0->id.proto = proto; > x0->props.family = AF_INET6; > x0->props.mode = mode; these are not correct. -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From toml@us.ibm.com Fri Mar 21 08:47:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 08:47:39 -0800 (PST) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2LGlWq9024437 for ; Fri, 21 Mar 2003 08:47:33 -0800 Received: from northrelay03.pok.ibm.com (northrelay03.pok.ibm.com [9.56.224.151]) by e6.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2LGlQMY149278; Fri, 21 Mar 2003 11:47:26 -0500 Received: from d01ml072.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215]) by northrelay03.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2LGlNPs093714; Fri, 21 Mar 2003 11:47:24 -0500 Subject: Re: [PATCH] IPSec: IPv6 source address not set correctly in xfrm_state To: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 5.0.11 July 24, 2002 Message-ID: From: "Tom Lendacky" Date: Fri, 21 Mar 2003 10:47:22 -0600 X-MIMETrack: Serialize by Router on D01ML072/01/M/IBM(Release 5.0.11 +SPRs MIAS5EXFG4, MIAS5AUFPV and DHAG4Y6R7W, MATTEST 
|November 8th, 2002) at 03/21/2003 11:47:25 AM MIME-Version: 1.0 Content-type: text/plain; charset=us-ascii X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2015 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1418 Lines: 40 > > atomic_inc(&x0->refcnt); > > } else if (create && (x0 = xfrm_state_alloc()) != NULL) { > > - memcpy(x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); > > - memcpy(x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); > > + memcpy(&x0->sel.daddr.a6, daddr, sizeof(struct in6_addr)); > > + memcpy(&x0->sel.saddr.a6, saddr, sizeof(struct in6_addr)); > > x0->sel.prefixlen_d = 128; > > x0->sel.prefixlen_s = 128; > > - memcpy(x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); > > + memcpy(&x0->props.saddr.a6, saddr, sizeof(struct in6_addr)); > > x0->km.state = XFRM_STATE_ACQ; > > - memcpy(x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); > > + memcpy(&x0->id.daddr.a6, daddr, sizeof(struct in6_addr)); > > x0->id.proto = proto; > > x0->props.family = AF_INET6; > > x0->props.mode = mode; > these are not correct. Ok, I see, because "a6" is defined as an array. Maybe all of the memcpy's should be the same (include the .a6 or don't include the .a6) so that it is consistent in the code? David, would you like me to resubmit the patch or do you just want to remove the last part of it? 
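A minimal user-space sketch of the point about "a6" being an array. The union demo_addr and the demo_* helpers are hypothetical stand-ins, not the kernel's definitions. For an array member, x->a6 and &x->a6 designate the same address, so the extra "&" does not change what memcpy receives. Taking "&" of a pointer parameter such as saddr, however, yields the address of the pointer variable itself, which is what the first hunk corrects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in; only the array member matters for this sketch. */
union demo_addr {
	uint32_t a4;
	uint32_t a6[4];
};

/* Returns 1 if x->a6 and &x->a6 designate the same address (they always do
 * for an array member; only the pointer type differs). */
static int demo_same_address(const union demo_addr *x)
{
	assert(x != NULL);
	return (const void *)x->a6 == (const void *)&x->a6;
}

/* Returns 1 if copying from &saddr picks up the pointer variable's own bytes
 * rather than the address data it points to.  Only sizeof(saddr) bytes are
 * read, so the demonstration stays within bounds. */
static int demo_address_of_pointer_copies_pointer(const union demo_addr *saddr)
{
	const union demo_addr *copy = NULL;

	memcpy(&copy, &saddr, sizeof(saddr));
	return copy == saddr;
}

/* The corrected form: saddr is already a pointer, so it is passed as is. */
static void demo_copy_addr(union demo_addr *dst, const union demo_addr *saddr)
{
	assert(dst != NULL && saddr != NULL);
	memcpy(dst, saddr, sizeof(*dst));
}
```

Calling demo_copy_addr(&dst, &src) copies all 16 address bytes, whereas a copy sourced from &saddr inside a function would start with the bytes of the pointer and then run past it.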
Thanks, Tom From garzik@gtf.org Fri Mar 21 09:00:51 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 09:00:59 -0800 (PST) Received: from havoc.gtf.org (havoc.daloft.com [64.213.145.173]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2LH0nq9024820 for ; Fri, 21 Mar 2003 09:00:50 -0800 Received: by havoc.gtf.org (Postfix, from userid 500) id 288026646; Fri, 21 Mar 2003 17:00:44 +0000 (US/Central) Date: Fri, 21 Mar 2003 12:00:44 -0500 From: Jeff Garzik To: Tom Lendacky Cc: "YOSHIFUJI Hideaki / ?$B5HF#1QL@" , davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com Subject: Re: [PATCH] IPSec: IPv6 source address not set correctly in xfrm_state Message-ID: <20030321170043.GA32417@gtf.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2016 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 365 Lines: 12 On Fri, Mar 21, 2003 at 10:47:22AM -0600, Tom Lendacky wrote: > David, would you like me to resubmit the patch or do you just want to > remove the last part of it? Not speaking for David, but in general you should update and resend your patches. David, myself and others get tons of patches and we don't have the time to babysit and hand-edit each one. 
Jeff From toml@us.ibm.com Fri Mar 21 09:39:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 09:39:15 -0800 (PST) Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.106]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2LHd5q9029713 for ; Fri, 21 Mar 2003 09:39:12 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e6.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2LHcSMY190046; Fri, 21 Mar 2003 12:38:28 -0500 Received: from tomlt2.austin.ibm.com (tomlt2.austin.ibm.com [9.41.94.20]) by northrelay04.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2LHcP1p141206; Fri, 21 Mar 2003 12:38:26 -0500 Subject: [PATCH] (Updated) IPSec: IPv6 source address not set correctly in xfrm_state From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 21 Mar 2003 11:39:50 -0600 Message-Id: <1048268391.1244.2.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2018 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 612 Lines: 20 Here is the updated patch. 
Thanks, Tom --- linux-2.5.65-orig/net/ipv4/xfrm_state.c 2003-03-17 15:44:21.000000000 -0600 +++ linux-2.5.65/net/ipv4/xfrm_state.c 2003-03-21 10:51:53.000000000 -0600 @@ -404,7 +404,7 @@ memcpy(&x->id.daddr, daddr, sizeof(x->sel.daddr)); memcpy(&x->props.saddr, &tmpl->saddr, sizeof(x->props.saddr)); if (ipv6_addr_any((struct in6_addr*)&x->props.saddr)) - memcpy(&x->props.saddr, &saddr, sizeof(x->sel.saddr)); + memcpy(&x->props.saddr, saddr, sizeof(x->props.saddr)); x->props.mode = tmpl->mode; x->props.reqid = tmpl->reqid; x->props.family = AF_INET6; From jgrimm2@us.ibm.com Fri Mar 21 16:38:48 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 16:38:55 -0800 (PST) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2M0c1q9006300 for ; Fri, 21 Mar 2003 16:38:48 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e4.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2M0bthF139364; Fri, 21 Mar 2003 19:37:55 -0500 Received: from us.ibm.com (touki.austin.ibm.com [9.41.94.47]) by northrelay04.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2M0br1p136296; Fri, 21 Mar 2003 19:37:53 -0500 Message-ID: <3E7BAC7E.AEC59251@us.ibm.com> Date: Fri, 21 Mar 2003 18:21:18 -0600 From: Jon Grimm X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.5.65 i686) X-Accept-Language: en MIME-Version: 1.0 To: "linux-net@vger.kernel.org" , "netdev@oss.sgi.com" Subject: [PATCH] Fix ip6_build_xmit bug Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2019 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgrimm2@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1030 Lines: 28 Wanting to play a bit with v6 fragmentation I started using ping6 to send various message sizes. 
Noticed that messages of sizes just under where fragmentation would kick in, segfaulted in ip6_build_xmit(). Looks like ip6_build_xmit does not allocate room for the dev->hard_header_len on the non-fragmentation path as is done in other places. The hard header len gets reserved even though room was not allocated for it. Consequently, the put of the raw data can overflow the skb. Patch below for your consideration. Best Regards, Jon Grimm --- lksctp-2.5/net/ipv6/ip6_output.c Fri Mar 21 17:27:00 2003 +++ lksctp-2.5.work/net/ipv6/ip6_output.c Fri Mar 21 17:24:38 2003 @@ -643,7 +643,8 @@ if (flags&MSG_PROBE) goto out; /* alloc skb with mtu as we do in the IPv4 stack for IPsec */ - skb = sock_alloc_send_skb(sk, mtu, flags & MSG_DONTWAIT, &err); + skb = sock_alloc_send_skb(sk, mtu + dev->hard_header_len + 15, + flags & MSG_DONTWAIT, &err); if (skb == NULL) { IP6_INC_STATS(Ip6OutDiscards); From jgarzik@pobox.com Fri Mar 21 19:28:56 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 19:29:01 -0800 (PST) Received: from www.linux.org.uk (IDENT:exim@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2M3Ssq9011267 for ; Fri, 21 Mar 2003 19:28:55 -0800 Received: from rdu57-8-131.nc.rr.com ([66.57.8.131] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 18wF5W-0007gZ-HE; Fri, 21 Mar 2003 05:29:06 +0000 Message-ID: <3E7AA337.5000402@pobox.com> Date: Fri, 21 Mar 2003 00:29:27 -0500 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Linus Torvalds CC: lkml , netdev@oss.sgi.com Subject: [BK PATCH] net driver merges Content-Type: multipart/mixed; boundary="------------090202050203080509050606" X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2020 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 8941 Lines: 304 This is a multi-part message in MIME format. --------------090202050203080509050606 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit --------------090202050203080509050606 Content-Type: text/plain; name="net-drivers-2.5.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="net-drivers-2.5.txt" Linus, please do a bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.5 This will update the following files: drivers/net/e100/e100_vendor.h | 311 ----- Documentation/networking/e100.txt | 9 Documentation/networking/e1000.txt | 132 +- MAINTAINERS | 5 drivers/net/3c509.c | 44 drivers/net/8390.h | 3 drivers/net/Kconfig | 56 + drivers/net/Makefile | 1 drivers/net/Makefile.lib | 1 drivers/net/Space.c | 2 drivers/net/at1700.c | 120 ++ drivers/net/e100/e100.h | 15 drivers/net/e100/e100_config.c | 21 drivers/net/e100/e100_config.h | 4 drivers/net/e100/e100_eeprom.c | 2 drivers/net/e100/e100_main.c | 272 +++-- drivers/net/e100/e100_phy.c | 38 drivers/net/e100/e100_phy.h | 4 drivers/net/e100/e100_test.c | 5 drivers/net/e100/e100_ucode.h | 2 drivers/net/e1000/e1000.h | 32 drivers/net/e1000/e1000_ethtool.c | 104 -- drivers/net/e1000/e1000_hw.c | 1920 ++++++++++++++++++++++++++----------- drivers/net/e1000/e1000_hw.h | 303 +++++ drivers/net/e1000/e1000_main.c | 834 ++++++++++------ drivers/net/e1000/e1000_osdep.h | 24 drivers/net/e1000/e1000_param.c | 85 + drivers/net/ne2k_cbus.c | 879 ++++++++++++++++ drivers/net/ne2k_cbus.h | 481 +++++++++ drivers/net/tg3.c | 8 30 files changed, 4243 insertions(+), 1474 deletions(-) through these ChangeSets: (03/03/21 1.1189) [netdrvr tg3] fix memleak in DMA test Also, bump version to 1.5. Leak fix contributed by Don Fry @ IBM (03/03/20 1.1171.1.17) [E1000] NAPI re-insertion w/ changes * Previous patch wiped NAPI support, adding it back here. 
But, with a twist: this one doesn't disable/enable interrupts each time we enter/leave polling. (It's EXPERIMENTAL). (03/03/20 1.1171.1.16) [E1000] whitespace fix from previous patches * Corrected indentation from previous patches (03/03/20 1.1171.1.15) [E1000] Controller wake-up thru ASF fix * Fixed controller wake-up through ASF (03/03/20 1.1171.1.14) [E1000] Added Interrupt Throttle Rate tuning support * Added Interrupt Throttle Rate tuning support (03/03/20 1.1171.1.13) [E1000] Added Tx FIFO flush routine * Added method to flush Tx FIFO after link disconnect; the hardware hangs on to Tx skb's that were in flight prior to link loss (03/03/20 1.1171.1.12) [E1000] Whitespace changes * Miscellaneous whitespace changes (03/03/20 1.1171.1.11) [E1000] Compaq to HP branding change * Changed "Compaq" branding to "HP" (03/03/20 1.1171.1.10) [E1000] Read/Write register macro optimizations * Optimized E1000_*_REG macros (03/03/20 1.1171.1.9) [E1000] Tx Descriptor cleanup * Completely clean Tx descriptor to avoid potential dirty descriptor fetching (rare, but possible) (03/03/20 1.1171.1.8) [E1000] Perform single PCI read per interrupt * ISR cleanup; performing single PCI read (03/03/20 1.1171.1.7) [E1000] Modulus math removed * Removed modulus math; decreases CPU utilization, especially on PPC64 [anton@samba.org] (03/03/20 1.1171.1.6) [E1000] Added MII support * Added MII support (03/03/20 1.1171.1.5) [E1000] Added 82541 & 82547 support * Added support for 82541 and 82547 gigabit ethernet adapters (03/03/20 1.1171.1.4) [E1000] IRQ registration fix * Fixed IRQ registration bug; IRQ now registered after resources are acquired (03/03/20 1.1171.1.3) [E1000] Spd/dplx abstraction; eeprom size changes * Setting speed/duplex is now its own routine * Update ETHTOOL_GEEPROM routine to use new eeprom size variable (03/03/20 1.1171.1.2) [E1000] Version, copyright, changelog and MAINTAINERS * Version, copyright, changelog and MAINTAINERS updates (03/03/20 1.1171.1.1) [E1000]
Documentation/networking/e1000.txt updates * Documentation/networking/e1000.txt updates (03/03/20 1.1187) [PATCH] Support PC-9800 subarchitecture (9/14) NIC This is the patch to support NEC PC-9800 subarchitecture against 2.5.65-ac1. (9/14) C-bus (PC98's legacy bus like ISA) network cards support. Change IO port and IRQ assign. Add NE2000 compatible driver for PC-9800. PCI network card works fine without patch. Regards, Osamu Tomita (03/03/20 1.1186) [E100] ASF wakeup enabled, but only if set in EEPROM On Thu, 20 Mar 2003, Scott Feldman wrote: * Check if ASF is enabled in EEPROM, and if so, enable PME wakeup when suspending. (03/03/20 1.1185) [E100] ethtool EEPROM and GSTRING fixes On Thu, 20 Mar 2003, Scott Feldman wrote: * Bug fix: read wrong byte in EEPROM when offset is odd number * Bug fix: memory leak in ETHTOOL_GSTRINGS [Oleg Drokin (03/03/20 1.1184) [E100] Validate updates to MAC address On Thu, 20 Mar 2003, Scott Feldman wrote: * Validate updates to MAC address as valid ethernet address. (03/03/20 1.1183) [E100] interrupt handler free fix On Thu, 20 Mar 2003, Scott Feldman wrote: * Bug fix on e100_close when repeating hot remove/hot add from team. Basically need to disable interrupts and unregister handler before shutting h/w down. * Need to mask only the relevant bits in the interrupt status register (03/03/20 1.1182) [E100] Honor WOL settings in EEPROM On Thu, 20 Mar 2003, Scott Feldman wrote: * Honor WOL settings in EEPROM: only advertise WOL magic packet if in EEPROM. (03/03/20 1.1181) [E100] ICH5 support added On Thu, 20 Mar 2003, Scott Feldman wrote: * ICH5 support: chipset integrated LAN (8255x) * PHY loopback diags is broken on all ICHs (03/03/20 1.1180) [E100] forced speed/duplex link recover On Thu, 20 Mar 2003, Scott Feldman wrote: * Bug fix when changing to non-autoneg, device may lose link with some switches, so try to recover link by forcing PHY.
(03/03/20 1.1179) [E100] Banish strong branding marketing strings On Thu, 20 Mar 2003, Scott Feldman wrote: * Get rid of all of the strong marketing brand strings and replace with simple pci_device_id table. pci.ids should be the master list for device ID/strings. (03/03/20 1.1178) [E100] Bug fix on setting up Tx csum On Thu, 20 Mar 2003, Scott Feldman wrote: * Bug fix on setting up Tx csum (03/03/20 1.1177) [E100] Clean up #include order On Thu, 20 Mar 2003, Scott Feldman wrote: * clean up #includes (03/03/20 1.1176) [E100] Add support for VLAN hw offload On Thu, 20 Mar 2003, Scott Feldman wrote: * Add support for VLAN hw offload (03/03/20 1.1175) [E100] Spelling mistakes On Thu, 20 Mar 2003, Scott Feldman wrote: * Spelling mistakes (03/03/20 1.1174) [E100] update version, copyright year, changelog On Thu, 20 Mar 2003, Scott Feldman wrote: * Update version, copyright year, changelog (03/03/20 1.1173) [E100] Update Documentation/networking/e100.txt On Thu, 20 Mar 2003, Scott Feldman wrote: * Update Documentation/networking/e100.txt (03/03/20 1.1172) [E100] back out memleak patch cuz it messed up following On Thu, 20 Mar 2003, Scott Feldman wrote: * Back this patch out - we'll add it later. I was working against 2.5.64 when this was checked into 2.5.65, so it messed up my patches. 
--------------090202050203080509050606-- From nivedita@w-nivedita.beaverton.ibm.com.sgi.com Fri Mar 21 19:57:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 21 Mar 2003 19:57:20 -0800 (PST) Received: from w-nivedita.beaverton.ibm.com (bi01p1.co.us.ibm.com [32.97.110.142]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2M3vAq9011724 for ; Fri, 21 Mar 2003 19:57:11 -0800 Received: from localhost (localhost [[UNIX: localhost]]) by w-nivedita.beaverton.ibm.com (8.11.6/8.11.0) id h2M3upG30448; Fri, 21 Mar 2003 19:56:51 -0800 Content-Type: text/plain; charset="us-ascii" From: Nivedita Singhvi To: kuznet@ms2.inr.ac.ru, davem@redhat.com Subject: Patch: minor nit in ip_options_compile() Date: Fri, 21 Mar 2003 19:56:50 -0800 User-Agent: KMail/1.4.1 Cc: netdev@oss.sgi.com MIME-Version: 1.0 Message-Id: <200303211956.50921.niv@us.ibm.com> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h2M3vAq9011724 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2021 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: niv@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1013 Lines: 35 In the following else clause, we check for opt->is_data, which should always be set in this case; if it is not, the current code will lead to a null pointer dereference, because skb is always null in this case. Figured it's better to fall through to returning EINVAL. Look reasonable?
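The concern can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel code: struct demo_skb, struct demo_options and demo_pick_optptr() are hypothetical names, and the real ip_options_compile() does considerably more. The point it illustrates is only the control flow: when no skb is supplied and the options carry no data, return -EINVAL instead of falling back to a dereference of the NULL skb.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_skb {
	unsigned char *opts;      /* where options start inside the packet */
};

struct demo_options {
	int is_data;              /* non-zero when data holds the options */
	unsigned char *data;      /* caller-supplied option bytes */
};

/* Picks the option pointer.  Returns 0 and stores the pointer in *optptr on
 * success, or -EINVAL when there is neither an skb nor option data, so the
 * NULL skb is never dereferenced. */
static int demo_pick_optptr(const struct demo_options *opt,
			    const struct demo_skb *skb,
			    unsigned char **optptr)
{
	assert(opt != NULL && optptr != NULL);

	if (skb != NULL) {
		*optptr = skb->opts;
		return 0;
	}
	if (!opt->is_data || opt->data == NULL)
		return -EINVAL;
	*optptr = opt->data;
	return 0;
}
```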
thanks, Nivedita --- /usr/src/linux-2.5.65/net/ipv4/ip_options.c Mon Mar 17 13:44:21 2003 +++ /usr/src/linux-2.5.65ref1/net/ipv4/ip_options.c Fri Mar 21 18:16:05 2003 @@ -245,7 +245,7 @@ int ip_options_compile(struct ip_options * opt, struct sk_buff * skb) { int l; - unsigned char * iph; + unsigned char * iph = NULL; unsigned char * optptr; int optlen; unsigned char * pp_ptr = NULL; @@ -259,7 +259,9 @@ optptr = iph + sizeof(struct iphdr); opt->is_data = 0; } else { - optptr = opt->is_data ? opt->__data : (unsigned char*)&(skb->nh.iph[1]); + /* Only caller here is ip_options_get(), sets up opt, no skb */ + if ((optptr = opt->__data) == 0) + goto error; iph = optptr - sizeof(struct iphdr); } From Robert.Olsson@data.slu.se Sat Mar 22 07:34:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 22 Mar 2003 07:34:20 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2MFYEq9029074 for ; Sat, 22 Mar 2003 07:34:17 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id QAA12348; Sat, 22 Mar 2003 16:34:02 +0100 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15996.33385.784192.430037@robur.slu.se> Date: Sat, 22 Mar 2003 16:34:01 +0100 To: Jeff Garzik Cc: netdev@oss.sgi.com Subject: [Fwd: [E1000] NAPI re-insertion w/ changes] In-Reply-To: <3E7B25AC.5030604@pobox.com> References: <3E7B25AC.5030604@pobox.com> X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2024 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 1179 Lines: 37 Jeff Garzik writes: > For review by the list... 
The other e100/e1000 changes are in the 2.5 > +#ifdef CONFIG_E1000_NAPI > + /* Don't disable interrupts - rely on h/w interrupt > + * moderation to keep interrupts low. netif_rx_schedule > + * is NOP if already polling. */ > + netif_rx_schedule(netdev); > +#else It's clean but I have some concerns... I think this will add interrupts when resources are fully utilized. In other words, a decrease in top performance. I say "think" because I have no numbers. At GIGE rate we have ~1 k interrupts/sec using interrupt delay (it depends on ring sizes etc.). We are now seeing Linux boxes with ~10 GIGE interfaces. So any effect gets multiplied. It makes the use of zero-latency RX complicated. We see Ethernet getting used for "new" applications such as SCSI, filesystems etc. In the current e1000 we can just set the desired interrupt delay and relax. If/when PCI-X uses message signalled interrupts (MSI) we have this "unnecessary" load over PCI too, with bus arbitrations etc. IMO your old plan of having e1000 irq disable with mitigation as default feels better, but testing is needed. Cheers.
--ro From yoshfuji@linux-ipv6.org Sat Mar 22 08:35:42 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 22 Mar 2003 08:35:45 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2MGZfq9029779 for ; Sat, 22 Mar 2003 08:35:42 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2MGZSDG029604; Sun, 23 Mar 2003 01:35:28 +0900 Date: Sun, 23 Mar 2003 01:35:28 +0900 (JST) Message-Id: <20030323.013528.19572208.yoshfuji@linux-ipv6.org> To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com CC: davem@redhat.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: [PATCH] IPv6: use "const" qualifier From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2027 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 4130 Lines: 120 Hello. Add the "const" qualifier to some arguments of the IPv6 address manipulation / testing functions. Patch is against linux-2.5.64 + ChangeSet 1.1188. This should be suitable for linux-2.4.x. Thanks in advance.
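A small user-space sketch of why the qualifier matters, assuming a POSIX <netinet/in.h> that declares the read-only objects in6addr_any and in6addr_loopback. The helper names demo_addr_is_any() and demo_check_constants() are hypothetical and are not the kernel functions touched by the patch. With a const-qualified parameter, a read-only address can be passed without a cast; with a plain "struct in6_addr *" parameter the same call would discard the qualifier and draw a compiler diagnostic.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical user-space analogue of an address test helper.  Because the
 * parameter is const-qualified, callers may pass read-only objects such as
 * in6addr_any without a cast. */
static int demo_addr_is_any(const struct in6_addr *a)
{
	assert(a != NULL);
	return memcmp(a, &in6addr_any, sizeof(*a)) == 0;
}

/* Returns 1 when both read-only constants classify as expected. */
static int demo_check_constants(void)
{
	return demo_addr_is_any(&in6addr_any) &&
	       !demo_addr_is_any(&in6addr_loopback);
}
```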
Index: include/net/addrconf.h =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/addrconf.h,v retrieving revision 1.1.1.4 retrieving revision 1.1.1.4.4.1 diff -u -r1.1.1.4 -r1.1.1.4.4.1 --- include/net/addrconf.h 22 Mar 2003 01:52:43 -0000 1.1.1.4 +++ include/net/addrconf.h 22 Mar 2003 15:05:07 -0000 1.1.1.4.4.1 @@ -175,7 +175,7 @@ * Hash function taken from net_alias.c */ -static __inline__ u8 ipv6_addr_hash(struct in6_addr *addr) +static __inline__ u8 ipv6_addr_hash(const struct in6_addr *addr) { __u32 word; @@ -195,7 +195,7 @@ * compute link-local solicited-node multicast address */ -static inline void addrconf_addr_solict_mult(struct in6_addr *addr, +static inline void addrconf_addr_solict_mult(const struct in6_addr *addr, struct in6_addr *solicited) { ipv6_addr_set(solicited, @@ -219,7 +219,7 @@ __constant_htonl(0x2)); } -static inline int ipv6_addr_is_multicast(struct in6_addr *addr) +static inline int ipv6_addr_is_multicast(const struct in6_addr *addr) { return (addr->s6_addr32[0] & __constant_htonl(0xFF000000)) == __constant_htonl(0xFF000000); } Index: include/net/ipv6.h =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/include/net/ipv6.h,v retrieving revision 1.1.1.4 retrieving revision 1.1.1.4.30.1 diff -u -r1.1.1.4 -r1.1.1.4.30.1 --- include/net/ipv6.h 9 Jan 2003 11:14:19 -0000 1.1.1.4 +++ include/net/ipv6.h 22 Mar 2003 14:56:24 -0000 1.1.1.4.30.1 @@ -226,21 +226,21 @@ unsigned int, unsigned int); -extern int ipv6_addr_type(struct in6_addr *addr); +extern int ipv6_addr_type(const struct in6_addr *addr); -static inline int ipv6_addr_scope(struct in6_addr *addr) +static inline int ipv6_addr_scope(const struct in6_addr *addr) { return ipv6_addr_type(addr) & IPV6_ADDR_SCOPE_MASK; } -static inline int ipv6_addr_cmp(struct in6_addr *a1, struct in6_addr *a2) +static inline int ipv6_addr_cmp(const struct in6_addr *a1, 
const struct in6_addr *a2) { - return memcmp((void *) a1, (void *) a2, sizeof(struct in6_addr)); + return memcmp((const void *) a1, (const void *) a2, sizeof(struct in6_addr)); } -static inline void ipv6_addr_copy(struct in6_addr *a1, struct in6_addr *a2) +static inline void ipv6_addr_copy(struct in6_addr *a1, const struct in6_addr *a2) { - memcpy((void *) a1, (void *) a2, sizeof(struct in6_addr)); + memcpy((void *) a1, (const void *) a2, sizeof(struct in6_addr)); } #ifndef __HAVE_ARCH_ADDR_SET @@ -255,7 +255,7 @@ } #endif -static inline int ipv6_addr_any(struct in6_addr *a) +static inline int ipv6_addr_any(const struct in6_addr *a) { return ((a->s6_addr32[0] | a->s6_addr32[1] | a->s6_addr32[2] | a->s6_addr32[3] ) == 0); Index: net/ipv6/addrconf.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/addrconf.c,v retrieving revision 1.1.1.8 retrieving revision 1.1.1.8.4.2 diff -u -r1.1.1.8 -r1.1.1.8.4.2 --- net/ipv6/addrconf.c 22 Mar 2003 01:52:23 -0000 1.1.1.8 +++ net/ipv6/addrconf.c 22 Mar 2003 15:01:28 -0000 1.1.1.8.4.2 @@ -172,7 +172,7 @@ const struct in6_addr in6addr_any = IN6ADDR_ANY_INIT; const struct in6_addr in6addr_loopback = IN6ADDR_LOOPBACK_INIT; -int ipv6_addr_type(struct in6_addr *addr) +int ipv6_addr_type(const struct in6_addr *addr) { int type; u32 st; @@ -486,7 +486,7 @@ /* On success it returns ifp with increased reference count */ static struct inet6_ifaddr * -ipv6_add_addr(struct inet6_dev *idev, struct in6_addr *addr, int pfxlen, +ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr, int pfxlen, int scope, unsigned flags) { struct inet6_ifaddr *ifa; -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From yoshfuji@linux-ipv6.org Sat Mar 22 08:35:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 22 Mar 2003 08:35:40 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by 
oss.sgi.com (8.12.8/8.12.5) with SMTP id h2MGZYq9029767 for ; Sat, 22 Mar 2003 08:35:35 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2MGZWDG029607; Sun, 23 Mar 2003 01:35:32 +0900 Date: Sun, 23 Mar 2003 01:35:32 +0900 (JST) Message-Id: <20030323.013532.24422763.yoshfuji@linux-ipv6.org> To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com CC: davem@redhat.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: [PATCH] IPv6: use RFC2553 constant From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2025 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 1289 Lines: 46 Hello. Use RFC2553 constant variable. Patch is for linux-2.5.65 + ChangeSet 1.1188 and depends on my use "const" qualifier patch. Thanks in advance. 
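As a sanity check of the substitution, here is a user-space sketch, assuming a POSIX <netinet/in.h> that provides in6addr_loopback. The demo_* function names are hypothetical. It builds ::1 by hand, the way init_loopback() did before this patch (zero the structure, then set the last byte to 1), and compares the result with the predefined constant.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Builds ::1 by hand: all bytes zero except the last one. */
static void demo_build_loopback(struct in6_addr *addr)
{
	assert(addr != NULL);
	memset(addr, 0, sizeof(*addr));
	addr->s6_addr[15] = 1;
}

/* Returns 1 if the hand-built address equals the predefined constant. */
static int demo_loopback_matches_constant(void)
{
	struct in6_addr addr;

	demo_build_loopback(&addr);
	return memcmp(&addr, &in6addr_loopback, sizeof(addr)) == 0;
}
```

Passing the address of the constant directly only type-checks when the receiving parameter is const-qualified, which is presumably why this patch depends on the "const" qualifier patch.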
Index: net/ipv6/addrconf.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/addrconf.c,v retrieving revision 1.1.1.8.4.2 retrieving revision 1.1.1.8.4.3 diff -u -r1.1.1.8.4.2 -r1.1.1.8.4.3 --- net/ipv6/addrconf.c 22 Mar 2003 15:01:28 -0000 1.1.1.8.4.2 +++ net/ipv6/addrconf.c 22 Mar 2003 15:16:50 -0000 1.1.1.8.4.3 @@ -1646,7 +1646,6 @@ static void init_loopback(struct net_device *dev) { - struct in6_addr addr; struct inet6_dev *idev; struct inet6_ifaddr * ifp; @@ -1654,15 +1653,12 @@ ASSERT_RTNL(); - memset(&addr, 0, sizeof(struct in6_addr)); - addr.s6_addr[15] = 1; - if ((idev = ipv6_find_idev(dev)) == NULL) { printk(KERN_DEBUG "init loopback: add_dev failed\n"); return; } - ifp = ipv6_add_addr(idev, &addr, 128, IFA_HOST, IFA_F_PERMANENT); + ifp = ipv6_add_addr(idev, &in6addr_loopback, 128, IFA_HOST, IFA_F_PERMANENT); if (ifp) { spin_lock_bh(&ifp->lock); ifp->flags &= ~IFA_F_TENTATIVE; -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From yoshfuji@linux-ipv6.org Sat Mar 22 08:35:37 2003 Received: with ECARTIS (v1.0.0; list netdev); Sat, 22 Mar 2003 08:35:41 -0800 (PST) Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2MGZaq9029772 for ; Sat, 22 Mar 2003 08:35:37 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2MGZZDG029610; Sun, 23 Mar 2003 01:35:35 +0900 Date: Sun, 23 Mar 2003 01:35:35 +0900 (JST) Message-Id: <20030323.013535.60875023.yoshfuji@linux-ipv6.org> To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com CC: davem@redhat.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: [PATCH] IPv6: use ipv6_addr_any() for testing unspecified address From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ 
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2026 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Content-Length: 1656 Lines: 50 Hello. Use ipv6_addr_any() for testing unspecified address. Patch is for linux-2.5.65 + ChangeSet 1.1188. This should be suitable for linux-2.4.x. Thanks in advance. Index: net/ipv6/addrconf.c =================================================================== RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/addrconf.c,v retrieving revision 1.1.1.8.4.3 retrieving revision 1.1.1.8.4.4 diff -u -r1.1.1.8.4.3 -r1.1.1.8.4.4 --- net/ipv6/addrconf.c 22 Mar 2003 15:16:50 -0000 1.1.1.8.4.3 +++ net/ipv6/addrconf.c 22 Mar 2003 15:27:05 -0000 1.1.1.8.4.4 @@ -426,8 +426,7 @@ } for (ifa=idev->addr_list; ifa; ifa=ifa->if_next) { ipv6_addr_prefix(&addr, &ifa->addr, ifa->prefix_len); - if (addr.s6_addr32[0] == 0 && addr.s6_addr32[1] == 0 && - addr.s6_addr32[2] == 0 && addr.s6_addr32[3] == 0) + if (ipv6_addr_any(&addr)) continue; if (idev->cnf.forwarding) ipv6_dev_ac_inc(idev->dev, &addr); @@ -2030,8 +2029,7 @@ struct in6_addr addr; ipv6_addr_prefix(&addr, &ifp->addr, ifp->prefix_len); - if (addr.s6_addr32[0] || addr.s6_addr32[1] || - addr.s6_addr32[2] || addr.s6_addr32[3]) + if (!ipv6_addr_any(&addr)) ipv6_dev_ac_inc(ifp->idev->dev, &addr); } } @@ -2368,8 +2366,7 @@ struct in6_addr addr; ipv6_addr_prefix(&addr, &ifp->addr, ifp->prefix_len); - if 
(addr.s6_addr32[0] || addr.s6_addr32[1] ||
-           addr.s6_addr32[2] || addr.s6_addr32[3])
+       if (!ipv6_addr_any(&addr))
                ipv6_dev_ac_dec(ifp->idev->dev, &addr);
        }
        if (!ipv6_chk_addr(&ifp->addr, NULL))

-- 
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From scott.feldman@intel.com Sat Mar 22 10:48:03 2003
From: "Feldman, Scott"
To: Robert Olsson, Jeff Garzik
Cc: netdev@oss.sgi.com
Subject: RE: [Fwd: [E1000] NAPI re-insertion w/ changes]
Date: Sat, 22 Mar 2003 10:47:18 -0800

> It's clean but I have some concerns...

Thanks for the feedback. It's a twist on the previous driver where we
disabled/enabled interrupts each time we went in/out of polling. Trying
to avoid those extra PCI writes. My experience is that you have to
really load up the interface to stay in polling mode (get up on step).

> I think this will add interrupts when resources are fully
> utilized. In other words a decrease in top performance. I say
> "think" because I have no numbers.
>
> At GIGE rate we have ~1 k interrupts/sec using interrupt
> delay. (it depend of ring sizes etc). We are now seeing Linux
> boxes with ~10 GIGE interfaces. So any effects gets multiplied.

Should be the same interrupt rate with or without NAPI.

> It makes the use zero latency RX complicated. We see Ethernet
> getting used for "new" applications as SCSI, filesystems etc.
> In current e1000 we just can set the desired interrupt delay
> and relax.
>
> If/when PCI-X uses message signalled interrupts (MSI) we have
> this "un-necessary" load over PCI too with bus arbitrations etc.
>
> IMO believe your old plan having e1000 irq disable and with
> mitigation as default feels better but testing is needed.

Easy enough to revert back. I don't think we've lost any of the
non-perf benefits of NAPI, and if testing shows no meaningful perf
difference, let's let Occam's razor rule.
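
The PCI-write cost discussed above can be put in rough numbers. The helper below is only an illustrative sketch and is not code from the e1000 driver: the function name is hypothetical, and it assumes exactly one interrupt-disable write plus one interrupt-enable write per polling cycle.

```c
#include <assert.h>

/*
 * PCI register writes spent on masking and unmasking the RX interrupt,
 * expressed per 1000 received packets.
 *
 * Assumption (not taken from the driver): each polling cycle costs one
 * disable write and one enable write, and handles `pkts_per_cycle` packets.
 */
static unsigned irq_toggle_writes_per_1000_pkts(unsigned pkts_per_cycle)
{
    assert(pkts_per_cycle > 0);
    return (2u * 1000u) / pkts_per_cycle;
}
```

Under that assumption the overhead is largest when each cycle handles a single packet and shrinks as more packets are collected per cycle, which is why the cost matters most on a lightly loaded interface that keeps dropping out of polling mode.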
-scott

From ahu@outpost.ds9a.nl Sat Mar 22 11:26:47 2003
Date: Sat, 22 Mar 2003 20:26:04 +0100
From: bert hubert
To: Jon Grimm
Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com
Subject: ip6sec MTU/fragmentation issue / Was: Re: [PATCH] Fix ip6_build_xmit bug
Message-ID: <20030322192604.GA3011@outpost.ds9a.nl>
References: <3E7BAC7E.AEC59251@us.ibm.com>
In-Reply-To: <3E7BAC7E.AEC59251@us.ibm.com>

On Fri, Mar 21, 2003 at 06:21:18PM -0600, Jon Grimm wrote:

> Wanting to play a bit with v6 fragmentation I started using ping6 to
> send various message sizes. Noticed that messages of sizes just under
> where fragmentation would kick in, segfaulted in ip6_build_xmit().

Thanks, this fixes an observed issue here with segfaults in ping6, as you
described. I run 2.5.65.

There is another problem with ip6sec however where fragmentation fails.
Setting up an ip6sec connection and then sending bulk data freezes up a
connection.

ping6 -s 1500 deef.ds9a.nl -n leads to:

20:16:18.125073 2001:888:1036:0:2a0:ccff:fec8:f25c > 2001:888:1036:0:2e0:18ff:fe23:cece: AH(spi=0x00003d54,sumlen=16,seq=0xbe): ESP(spi=0x00003d55,seq=0xbe) (len 1504, hlim 64)
20:16:18.125129 2001:888:1036:0:2a0:ccff:fec8:f25c > 2001:888:1036:0:2e0:18ff:fe23:cece: AH(spi=0x00003d54,sumlen=16,seq=0xbf): ESP(spi=0x00003d55,seq=0xbf) (len 112, hlim 64)

and a reply:

20:16:18.125474 2001:888:1036:0:2e0:18ff:fe23:cece > 2001:888:1036:0:2a0:ccff:fec8:f25c: AH(spi=0x00005fb4,sumlen=16,seq=0x82): ESP(spi=0x00005fb5,seq=0x82) [hlim 0] (len 160)

The reply appears to be a bit short and is possibly an ICMP error. When I
configure ip6sec only in one way, I get this reply to fragmented ICMP echo
requests:

20:22:24.445157 2001:888:1036:0:2e0:18ff:fe23:cece > 2001:888:1036:0:2a0:ccff:fec8:f25c: icmp6: parameter problem next header - octet 6 (len 116, hlim 64)

This is probably the same packet as we see encrypted above.

Working ping6, -s 1400, looks like this:

20:18:56.820699 2001:888:1036:0:2a0:ccff:fec8:f25c > 2001:888:1036:0:2e0:18ff:fe23:cece: AH(spi=0x00003d54,sumlen=16,seq=0x142): ESP(spi=0x00003d55,seq=0x142) (len 1456, hlim 64)
20:18:56.821912 2001:888:1036:0:2e0:18ff:fe23:cece > 2001:888:1036:0:2a0:ccff:fec8:f25c: AH(spi=0x00005fb4,sumlen=16,seq=0xce): ESP(spi=0x00005fb5,seq=0xce) [hlim 0] (len 1456)

Both of these hosts have your patch applied. So it seems that ip6sec
fragmentation has some issues.

Thanks.
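
For comparison with the sizes in the traces above, here is a minimal sketch of how plain IPv6 fragmentation divides a payload. It is an illustration only, not the kernel's code path: the function name is hypothetical, the link MTU is a parameter (the MTU of the hosts above is not stated), and AH/ESP overhead is deliberately ignored. It relies only on the fixed 40-byte IPv6 header, the 8-byte Fragment header, and the rule that every fragment except the last carries a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define IPV6_HDR_LEN 40u /* fixed IPv6 header */
#define FRAG_HDR_LEN  8u /* IPv6 Fragment extension header */

/*
 * Split `payload` bytes (everything after the IPv6 header) into fragments
 * for a link with the given MTU. Per-fragment data sizes are written to
 * out[]; the return value is the fragment count, or 0 if out[] is too small.
 */
static size_t ipv6_fragment_sizes(unsigned payload, unsigned mtu,
                                  unsigned *out, size_t max_out)
{
    /* Need room for both headers plus at least one 8-byte unit. */
    assert(mtu >= IPV6_HDR_LEN + FRAG_HDR_LEN + 8u);

    if (payload <= mtu - IPV6_HDR_LEN) {
        /* Fits in one packet, so no Fragment header is needed. */
        if (max_out < 1)
            return 0;
        out[0] = payload;
        return 1;
    }

    /* Largest data size per fragment, rounded down to a multiple of 8. */
    unsigned per_frag = (mtu - IPV6_HDR_LEN - FRAG_HDR_LEN) & ~7u;
    size_t n = 0;

    while (payload > 0) {
        unsigned chunk = payload > per_frag ? per_frag : payload;
        if (n == max_out)
            return 0;
        out[n++] = chunk;
        payload -= chunk;
    }
    return n;
}
```

Assuming a 1500-byte MTU, an echo request with 1500 data bytes (1508 bytes including the 8-byte ICMPv6 header) needs two fragments, while one with 1400 data bytes fits in a single packet.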

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software
http://lartc.org             Linux Advanced Routing & Traffic Control HOWTO
http://netherlabs.nl         Consulting

From Robert.Olsson@data.slu.se Sat Mar 22 12:28:29 2003
From: Robert Olsson
Message-ID: <15996.51038.928453.501527@robur.slu.se>
Date: Sat, 22 Mar 2003 21:28:14 +0100
To: "Feldman, Scott"
Cc: Robert Olsson, Jeff Garzik, netdev@oss.sgi.com
Subject: RE: [Fwd: [E1000] NAPI re-insertion w/ changes]

Feldman, Scott writes:

 > > It's clean but I have some concerns...
 >
 > Thanks for the feedback. It's a twist on the previous driver where we
 > disabled/enabled interrupts each time we went in/out of polling. Trying
 > to avoid those extra PCI writes. My experience is that you have to
 > really load up the interface to stay in polling mode (get up on step).

True. Making the interrupt delay larger collects more packets on the RX
ring, so the two PCI writes that disable/enable the irq are shared by many
packets.

 > Should be the same interrupt rate with or without NAPI.

When NAPI stays in polling there are no interrupts and no extra PCI writes,
so the high-load situation is optimized.
So I fear that interrupts are now added to the high-load situation and that
this will impact top performance, especially with many NICs. But let's see
what comes out of testing.

Cheers.

--ro

From davem@redhat.com Sun Mar 23 01:19:00 2003
Date: Sun, 23 Mar 2003 01:16:14 -0800 (PST)
Message-Id: <20030323.011614.34129941.davem@redhat.com>
To: toml@us.ibm.com
Cc: netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru
Subject: Re: [PATCH] (Updated) IPSec: IPv6 source address not set correctly in xfrm_state
From: "David S. Miller"
In-Reply-To: <1048268391.1244.2.camel@tomlt2.tomloffice.austin.ibm.com>
References: <1048268391.1244.2.camel@tomlt2.tomloffice.austin.ibm.com>

   From: Tom Lendacky
   Date: 21 Mar 2003 11:39:50 -0600

   Here is the updated patch.

Applied, thanks a lot Tom.

From davem@redhat.com Sun Mar 23 01:24:56 2003
Date: Sun, 23 Mar 2003 01:22:29 -0800 (PST)
Message-Id: <20030323.012229.36386243.davem@redhat.com>
To: jgrimm2@us.ibm.com
Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [PATCH] Fix ip6_build_xmit bug
From: "David S. Miller"
In-Reply-To: <3E7BAC7E.AEC59251@us.ibm.com>
References: <3E7BAC7E.AEC59251@us.ibm.com>

   From: Jon Grimm
   Date: Fri, 21 Mar 2003 18:21:18 -0600

   Looks like ip6_build_xmit does not allocate room for the
   dev->hard_header_len on the non-fragmentation path as is done in
   other places. The hard header len gets reserved even though room
   was not allocated for it. Consequently, the put of the raw data can
   overflow the skb.

   Patch below for your consideration.

Applied, but with a minor fix. We now have a LL_RESERVED_SPACE(dev)
macro in include/linux/netdevice.h that gets this formula correct
and thus I have used it.

Thanks.

And yes we do know things are still slightly broken with ipv6
fragmentation wrt. IPSEC, and that is being actively worked on.
The IPV4 output path hacks just need to be duplicated into ipv6
before that will start working reliably.

From wichert@wiggy.net Sun Mar 23 02:38:39 2003
Date: Sun, 23 Mar 2003 11:37:49 +0100
From: Wichert Akkerman
To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: ipv6 ipsec tunnel mode produces broken packets
Message-ID: <20030323103748.GH21175@wiggy.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="qMm9M+Fa2AknHoGS"

--qMm9M+Fa2AknHoGS
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I have been playing a bit with ipv6 ipsec support on a stock 2.5.65
kernel. Transport mode seems to work very well, and I've been able to
talk to FreeBSD and OSX boxes using it. I can't seem to get tunnel mode
to work correctly though.

Here is the config that I feed to setkey:

add 3ffe:8280:10:1d0:290:27ff:fe2d:968c 2001:888:1037::bad:c0f:fee esp 666
    -m tunnel -E 3des-cbc "210987654321098765432112";
add 2001:888:1037::bad:c0f:fee 3ffe:8280:10:1d0:290:27ff:fe2d:968c esp 666
    -m tunnel -E 3des-cbc "210987654321098765432112";

add 3ffe:8280:10:1d0:290:27ff:fe2d:968c 2001:888:1037::bad:c0f:fee ah 666
    -m tunnel -A hmac-md5 "6543210987654321";
add 2001:888:1037::bad:c0f:fee 3ffe:8280:10:1d0:290:27ff:fe2d:968c ah 666
    -m tunnel -A hmac-md5 "6543210987654321";

spdadd 3ffe:8280:10:1d0:290:27ff:fe2d:968c/128 2001:888:1037::0/48 any
    -P out ipsec
    esp/tunnel/3ffe:8280:10:1d0:290:27ff:fe2d:968c-2001:888:1037::bad:c0f:fee/require
    ah/tunnel/3ffe:8280:10:1d0:290:27ff:fe2d:968c-2001:888:1037::bad:c0f:fee/require;

spdadd 2001:888:1037::0/48 3ffe:8280:10:1d0:290:27ff:fe2d:968c/128 any
    -P in ipsec
    esp/tunnel/3ffe:8280:10:1d0:290:27ff:fe2d:968c-2001:888:1037::bad:c0f:fee/require
    ah/tunnel/3ffe:8280:10:1d0:290:27ff:fe2d:968c-2001:888:1037::bad:c0f:fee/require;

3ffe:8280:10:1d0:290:27ff:fe2d:968c is my machine, and
2001:888:1037::bad:c0f:fee is a remote FreeBSD box.

When I try to ping6 the remote machine, tcpdump sees some 'interesting'
traffic:

11:25:40.990850 tornado.wiggy.net > prut.net: AH(spi=0x0000029a,sumlen=16,seq=0x90): truncated-ip6 - 8061 bytes missing!1037::bad:c0f:fee:0:29a > 0:90:a945:bdc0:130:338f:30b8:c5d1: ip-proto-8 8193 [class 0xe2] [flowlabel 0xd968c] (len 8193, hlim 136) [flowlabel 0x29a] (len 172, hlim 144)
11:25:42.014570 tornado.wiggy.net > prut.net: AH(spi=0x0000029a,sumlen=16,seq=0x91): truncated-ip6 - 8061 bytes missing!1037::bad:c0f:fee:0:29a > 0:91:9bc6:ed67:e7ce:8d58:ade2:2d57: ip-proto-8 8193 [class 0xe2] [flowlabel 0xd968c] (len 8193, hlim 136) [flowlabel 0x29a] (len 172, hlim 145)

8061 bytes missing from a packet strikes me as somewhat odd.
If I look at the packet with ethereal it looks as if the ESP transform produces incorrect data; I can see part of the destination address in the source address. I have attached the tcpdump to this message. Wichert. -- Wichert Akkerman http://www.wiggy.net/ A random hacker --qMm9M+Fa2AknHoGS Content-Type: application/octet-stream Content-Disposition: attachment; filename=dump Content-Transfer-Encoding: base64 1MOyoQIABAAAAAAAAAAAAP//AAABAAAApIt9PoIeDwDiAAAA4gAAAABQBAvdeQCQJy2WjIbd YAACmgCsM5A//oKAABAB0AKQJ//+LZaMIAEIiBA3AAAAAAutDA8P7ikEAAAAAAKaAAAAkP5c 06RDZLcEqDpYrP4tlowgAQiIEDcAAAAAC60MDw/uAAACmgAAAJCpRb3AATAzjzC4xdH7Wqrx OjT7RpJSzzUyC6EHCKfIzhyACnWgkEvkDrd7XUQq3lMwsNJGpiPgJYypfoODOHdXf2M/uYKW 0efJQaZxAy71dUE/61lH0vKjlq9T5ke+GV/Yby3+9bkIrforJakYxhr6m8btZ+fOjVimi30+ 6jgAAOIAAADiAAAAAFAEC915AJAnLZaMht1gAAKaAKwzkT/+goAAEAHQApAn//4tlowgAQiI EDcAAAAAC60MDw/uKQQAAAAAApoAAACR9zebD9TCI4LBLy6n/i2WjCABCIgQNwAAAAALrQwP D+4AAAKaAAAAkZvG7Wfnzo1YreItV+QHBqgGVZUDN0UC11tEomx9XV1DQSGWciKBONSOuMtm DFoq8jntHjTmg5Qr/eXB+ZKqJWgIIZqXWuZGMcoJQzg+DijXWqAStvdQafSHQqA4s7eEPmSe HTekNzHuoOi/bmVsz1hDKmeXFmzzG6eLfT7nlQAA4gAAAOIAAAAAUAQL3XkAkCctloyG3WAA ApoArDOSP/6CgAAQAdACkCf//i2WjCABCIgQNwAAAAALrQwPD+4pBAAAAAACmgAAAJLRt8Pk m0KNKeRhjDf+LZaMIAEIiBA3AAAAAAutDA8P7gAAApoAAACSQypnlxZs8xs4ftgty3l/26Yj 9pCoKS6jAuY2KSzLruFxmiA1QIhE76EJHcKfqZKrrKZiyyBzd0ARPzA8MQNXNDW216pYVuys +iC4LIfQSKQe8BjN2Z2Jy5wUAuKVMkCEgFSKi1lyfbLcoZlbTncNURi4LJggiTPqp4t9PtCM BACGAAAAhgAAAABQBAvdeQCQJy2WjIbdYAAAAABQBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQ AdACUAT//gvdeYAFABahJQ4Q1ATDoYAY9XBHggAAAQEICgA1nQREMZJoeMhqfMVW7KnJwqnk IXntBjjkRv7SyZ53bdMvMn5sqNS+osOFIF7w/kleY7TTfvL8p4t9Pk8jBQBWAAAAVgAAAACQ Jy2WjABQBAvdeYbdYAAAAAAgBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAW gAXUBMOhoSUOQIAQK/DULwAAAQEICkQxoy8ANZ0Ep4t9Pk3+BwCGAAAAhgAAAABQBAvdeQCQ Jy2WjIbdYAAAAABQBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ5A 1ATDoYAY9XDwTwAAAQEICgA1neVEMaMvnzHp0+vq9lhHpzBn8blqikfU4Od+zWH4OOVc7bRu 
2Q3fauJafciIJ+TKbYn4UZp8p4t9Pvz+BwBWAAAAVgAAAACQJy2WjABQBAvdeYbdYAAAAAAg BkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBMOhoSUOcIAQK/DTDAAA AQEICkQxo0EANZ3lp4t9PoQkCACGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKA ABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBMOhoSUOcIAYK/CVBgAAAQEICkQx o0IANZ3luYN8eTRcB0rBlOeP2RMjiRuNbX73LE7Euk97pZ2wjtrPzUCuzxAEAqItr0TgtUtN v9429/+WjS8rY3prXDUBLWVFYIYsv1h9X/2ykOUEaLrUapglMxl6JGtG4+Sfwi5m0l8UdH50 i7RW5gl2E2SluggLOUwitsZ0b2DtDlFchio6tiVu7q5BDaoRuXV8+t9PgSNcjmvCUnA5i2pO xgir1nj6v+pHbVTvi687r5qZi91vTph5z8sgghdvFI9VFMX2db9tPqLMSpKrD/vAMcxTnRNl czsCll3NxiNtkfjxXC3xnlnKksJxNVmZw71qjYPWIny/9BGGp2BkuW3lcdp1qRwlzfsV6NGl vMXeZbG4uMW4wCW/H9lL9DV3xS5bARC2kcxOvM1CURkGSlAWXFZmhKeLfT7+JAgAVgAAAFYA AAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L 3XmABQAWoSUOcNQExNGAEPVwCFEAAAEBCAoANZ3vRDGjQqeLfT6fJggAhgEAAIYBAAAAkCct lowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF 1ATE0aElDnCAGCvwwgIAAAEBCApEMaNCADWd79QnjgSxnQqDYU7W9uX8oxvrMjxp8gzqFqGd lcdN+GEnmBDCWjR5cAaaqeEETgFWUdZGA9MAyuzN5l8OIsGx8I/XlGnrTYYq4fNVCYTuc2gc YtTYa9CsqQVzNf6Wh48x8XxI63rj/oKjOIKZ0pQq5OzKLilO6HJiLuMvqBvb9yd+3+jGxY9D DmxsL32IT9vRQFAwCG4ZTtWPDMwaGvnKQp1imfMHmBrRqkmqB07jFRe1rG49cxZ0kILbcbo+ kTdh1lYBJEJuJbgcpJicbw2b96PBY00ARlp1uLy5ZLRaJhrqBi33TDZvBO7JPMkVR9prLsIq WJr00nNhzo3SoDQUib+nGGM3pKOyaFAmFenM7xTgcJnWgDK2KG5XRxbzYgDQ8C+5l1GgSIrT 6ZEhM29qw1ani30+HCcIAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQ ApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElDnDUBMYBgBD1cAcgAAABAQgKADWd8EQx o0Kni30+dygIAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L 3Xk//oKAABAB0AKQJ//+LZaMABaABdQExgGhJQ5wgBgr8KtPAAABAQgKRDGjQgA1nfDnkkms UVVu5+q1SuJOzhsVYkqfWtPIX7YjfnOtunim1F/kOPgnrcuhJuyj46/AwghXki033dnVCjTh FsjPx1n5T9ZF5AYqDlmMUQX1U0fPHU40riL0oX+/9EnCuhFA5oArbb6zLJ3YT4ItztuSP4hT f05rlaXx1pBWT+pjSMYNcPhI5KB7fcWfEzAPbhfXLI44a8yhBqiz9HhpKwcSl9//hsU/QwVD fA4ZOFaGlpA/vxC9J5lsy+1D4XreAmut8i5OfniLEAIJlTIX2bDB+OR0RwnDbfSIHXnF6kjD 
zvLeIbHlWj9Nf92z+dqedCYdl1JMpQn/MjxkUvx2JVS+DeabF+NxDNS6XhbRb4emmGd66WJB TA8LwmRfh809ilIN8IpLuAGtFEB1G+i29bp9O9OWp4t9PuQoCABWAAAAVgAAAABQBAvdeQCQ Jy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ5w 1ATHMYAQ9XAF8AAAAQEICgA1nfBEMaNCp4t9Pk4qCACGAQAAhgEAAACQJy2WjABQBAvdeYbd YAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBMcxoSUOcIAY K/DOqQAAAQEICkQxo0IANZ3wwSLD0tMC1ZrHPN4sKjFNnubnj5fnoNsdJX71luvfbyPeDsz3 qLXRRv4FBS+xph/jpQtqDdzX95tCPtO1qz6UNg2I0XLPAI+sK+mrqLYMfkWlPGFm8iatOv+q AWA0NYEdbK/tE5o1665Q5XTT0jIul20aLKxwKVUCz02G8Jqangm7KhFl/scdwaNpht/acwex 8vmnTK/evgEiQqqZ2ikpqEaJNxm+RC5BnGZvsQy+ac2ky7sDvjbFX1b1EXR+Ll27+0rcZSOc 0NSsi+8VUJk1A+p3BCveUm0s0XDkS7TS+99fNOBpucaw5lnkm2lEi+65/GymbHo+jTWE9Dtj tMbiMFZmwed0jftaRaxE9qOGvZkIznpKBDWybhtn6UqRg4TMCHDxznMoy2igvdyypDEwTaeL fT6/KggAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+ goAAEAHQAlAE//4L3XmABQAWoSUOcNQEyGGAEPRABe8AAAEBCAoANZ3xRDGjQqeLfT4uLAgA hgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQ ApAn//4tlowAFoAF1ATIYaElDnCAGCvwt9AAAAEBCApEMaNCADWd8avBmmJVrs0kGIoxP3kt N6alwcSqza7R32Dtj3izUUFL/I0/aYZUqw0C0HTf6FuCVZDsarVhaNORa/ljYXgAReAAoFNG RKnuwNKVQv1aXIiIJa3ec3zAm/+rq938jPw2r1iO++oW8iPB+7yPLK9cpK/+HRYL0FqEQx6M 4tPEcjybAsq7h4MpOWcpZ2gaKEJtKyRbMJSvEJX4OEXw0iVhwUD+NOHQR7XjZjDw6L3rKCk3 qPyv7ZleFw2m0wLAc3LTcH/WjPGCPZ7Hx4dZnySgK6CTLcBRzEn3hvo/CkJ7byn2RXLc5ifH EKCQ+8vZS4EAC809M2hStdzTo31Ik/4dY8aoshAoxhYZZmeCnI4cVm4HV2Vmt6/CxqSkXe17 zViS2O0YEST1KLQVZH2cg7KI6S2ni30+pCwIAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAA ACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElDnDUBMmRgBD1cAOP AAABAQgKADWd8UQxo0Kni30+BS4IAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+ goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQEyZGhJQ5wgBgr8NTzAAABAQgK RDGjQgA1nfHJlyKsW6iWUoGGLFqtrHFUS+oZxrrjJd15XCjOIKYJw62Dbz4B+iKtRnTvu2ox d6Q58HU6T6By2w2w4TK7h+f+14NWkYhJTNBh0JOz5DNO0I6XNbpdUSgdk66iIjGzzLwPbQpH q1k3BGc/lvxjlGj3FyTm9GUUSyidxnoyF1m/vyslcZ6odWpmcE5TJnqeOmBKvAxtUI08MA20 
jTUnME0htooUwVrj+5FwGJkFUQ5AK+QSNCsDEoTWwW5rAwC1fRRQckR9CugMoZ9mATW4eNdV 4vXPppKwoVp2dEGcf9xNZ5eXIHyjqpSsZXao8BLJ6ixYIZp1UT9vXvukOR3fXp74VYNaeYH1 YyfGdz4BEbjYZuf8bJ9sDr4BMlAjEhdCZZ6I04UyqH6HUb1d300pACh8p4t9PmouCABWAAAA VgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT/ /gvdeYAFABahJQ5w1ATKwYAQ9XACXgAAAQEICgA1nfJEMaNCp4t9PtsvCACGAQAAhgEAAACQ Jy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAW gAXUBMrBoSUOcIAYK/B+xgAAAQEICkQxo0IANZ3yil9gnlajMttDFxYpIuDdfLzu7FfFECs3 Se3/i+l2/YgCBgf1WFwCL7mAwGOMFcV1leB+YYien8D90IZr8btAdw/mZQ1Yr+vIxOUYtB5O FjvWWu9aey4tMtiLYzoO6+CmRo2m4gN8YCFx85GiLi1wMnSpNC0DSACgvYvccM9eg8BlsLka KVLaEIVMpkWhGPaEnFQxsW9BCtH5pP6jg2hP1jxSAW81N6EVlhdabNj9ZHlEcoPa96Od0fGx uBJW2t28mKZqViSftDWQ5p8RlQ/X91W20Gekw67bTrcmHVCw+PEV+S10otyD9AwaBiromGxO BBbLloZ+zsQWme/0G0eIC5uuJbBz+kA2WBKvebv474Fy4Z/P2pNUxE0ZoPMGRwmOYSoju3Zj UBogAnbG1de18aeLfT5DMAgAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQ AdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUOcNQEy/GAEPVwAS4AAAEBCAoANZ3y RDGjQqeLfT7AMQgAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT/ /gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATL8aElDnCAGCvwCYkAAAEBCApEMaNDADWd8jnl d7DuRiVPyXm/l4bFV+tvc4iede29/qsIJMH1IgEbsDrbFtmFFRqcuUch+72+IzI2vkr7+X4d DO+9gqgkhZ4+0o4ONpCrjVook4VGWtnx3A0ySzhkpGUZQm3ueV9Q8SAeN/F6LN4xgaDgKVga cCq+LZBX/UPHcXURrjVtpaJk0mqxv0DhdI/hKSHRZ4O3aWCuRndzThoo6zdqqqd/jsu1ydDX 5m+I+E8XnoQEkRMR8AbdcqGeHqKevUJsq3Vllvsd7dBKYhFF+2JR2PdHXOuKalRc6XzQJYet v5ASdyodOeEEaeJXuYdgIHqcVB5+kqzUEXo6xU+Txr+LLhey+/8BhSyuWHNHBeRywfaK07bT rJoRaOc62kb3e9cf1R5en7K832ZxkXKah/+JVFWGJ2yni30+IzIIAFYAAABWAAAAAFAEC915 AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqEl DnDUBM0hgBD1cP/7AAABAQgKADWd80Qxo0Oni30+ljMIAIYBAACGAQAAAJAnLZaMAFAEC915 ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQEzSGhJQ5w gBgr8KsDAAABAQgKRDGjQwA1nfPnzUYZ9Sg6SH3u0PAH/MIrH6+OmRtcP5ecKBAJixNyHw9A bR+cD6QVkYQlxKV84lO8g/uKpy7iqiIqe9Zr2M7vKMbkTkVCRzkz6m3YgRbWLuwhYm1yf6/g 
HxFMJFFebg71Zq4viyuGOFGfg8I8dWbfG9+K8ig3xS2ovPGbjyElP2GRJs7h9XCfopjBtr4L bedKeZQO371I3FXkERSHw5FnmtilM6XV8UDgcd33iB9tasho560J9TxR+ILXONXh0riZLTUU liINE8Am+ZrI59J6GNpJ+HkQIFCBngCMGKiM3AosEj1r1CBpor3Mx8qP3QjTtQ1GkB7nai+A gTLvlRX2Nbhe1xy7eF9Wujs/ISupMm4ZbHn/Ho8IVz2AY8tBYBkq6rW94I2jsJ3edOUBT48n p4t9Pgs0CABWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaM P/6CgAAQAdACUAT//gvdeYAFABahJQ5w1ATOUYAQ9ED/+wAAAQEICgA1nfNEMaNDp4t9PnA1 CACGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQ AdACkCf//i2WjAAWgAXUBM5RoSUOcIAYK/B7zgAAAQEICkQxo0MANZ3zNoqhc8X3EcPDVMFF E397g/zOfQVrdftNdqg6zr4fBOIcOss7p4zoruQqbHsq4KfZbhjxrhZWlAye2OUSdQhxPOWu qxk8QaG0gtUKS4XwruFp7tqZrAQnZSh19yi2MVIrrBXVVbTPxzcnY9GaIDieSyJMthfj2tjD yoe8hJLtxDFJI4ujd9BHrdJJMYMW1Ac5CNsuPcKbo6vsSk1au4lBYxokEkomWd4vh+2I8NFU V72aAEpd1tDfx49Qv9oUSYiAKW3UPNYD2dl7L5frhW245Ljg1sjuGLU2WOZzkI8/YMx05Zkw YsE9QuVnfAvrU0aT3IcKLIKmU5WUInmP1wthoql6TXFgSdmLeZ1czpVkmaQEv/RtAedU73+O VNWz0OpqLiK82ggRRIbk1Q6BUPJm46eLfT7lNQgAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAA AAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUOcNQEz4GAEPMQ //oAAAEBCAoANZ30RDGjQ6eLfT5BNwgAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZA P/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATPgaElDnCAGCvwrtIAAAEB CApEMaNDADWd9PqNnSVREPAHaZpdH4gRV2xtERncDDaAJcNZ4vxuZhDGlXvPlr2iLy9jXt/B 7BYXNHfBnuXJ/FM9BPlkSpo96fY8NRDt4lgS6PRuD+H0nsQjytQazd4RUREWPSxBmQofbpXQ 8Yl9oJVN7QY3xkI9pRlWQnXVNbHDlgTrDI7dzmgfVztLnuLiubg/OfSGxW9DfS9ISOgGYGnx i7PlznjekH1Eg+yKtzvuy6rpVSfm3Daz0/OcMUlSqm+y6FfeTj4LzCl1OwEvseCjuF36WKgN CdBb62fSUiQEaflshRqYW5+hXAuJ/BabaoRaH+62LcHCAElMEn84fknGofbvdzc6izuKEldW 43lGAKO2e5tNOSDLMu9NF6K8eO4RiSwdYqMaFG6cvwpgNYQcCB/gf23tY46ni30+vTcIAFYA AABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQ BP/+C915gAUAFqElDnDUBNCxgBDx4P/6AAABAQgKADWd9EQxo0Oni30+GjkIAIYBAACGAQAA AJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaM ABaABdQE0LGhJQ5wgBgr8DlpAAABAQgKRDGjQwA1nfRHyK3rJALp+231+6uX1M/bIgeSH+Yj 
Mu/eaAhwUfdR56qDKb9AWJgHYHfRxS1GAq0xUxtP1DwpCMu8kYDLVjaJCpySiOwHhQJwSCDj TG9GhhoYFfL2HRG8Z/yCbyWcxdV3BMHRvnKBsoVZjv0RocBBFLESoh5ogAg2wIchJ8ATNF3y tKfZVA1wV5OMDm8yAkrJhxijW6vYaJ72F0oamfUvb1gc/nQCR2dZQwgPXVbeXO5AM6UG6hHr N2VwH1nmoWx7yidCdtQFK9O4aaMzQFD86p4rbmfvWfAdi/j6cJfmNDrVo4hLXROxFslQBy2t By7KWJfgJV07HP6ligTCTMBsJdM9FvQfDPdfVC8DtfGUy+XgnS0OvK7OlI2ehOq5G1nMIVF7 UiEZQaZtbwpnitGUp4t9Pp85CABWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKA ABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ5w1ATR4YAQ8LD/+QAAAQEICgA1 nfVEMaNDp4t9PuU6CACGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQ BP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBNHhoSUOcIAYK/AnUAAAAQEICkQxo0MANZ31 l2/9aXbbnzIEnPGdeN3a7elDzCEwzpau7fLXca/E/4Ko41xwPhosIadI/1EXpd13DpmvJXUg 10NuLffvrnCwQctxF5WPtSU0JrD9AyXmGHWDoJnY3rl5wZZYCdghOHxrpWYhM4UK5ZgLnYEs 0fAEQ/9SPbMeI84KsVRMXuT7hqn9O9CUEcnBT3YfUYU/3E5FkRq4oZUUOpibafQiDpIiwOPX zwRooeE9mCYW5KSJM3TgrSTaUE1ueC/8wDmsM+LdtJAcNab5LMXqvTN9rV6+iVADBS0BQWqL NbQcvperu1MNgusUqKvjIrgi90HqBPzlN+Eq+gUTcdALpETj2duNY9JR3J4Lmcvam0H6mgsP 594AEKgW2JC2I26z3r6ohzLTOtkcmg1GQSpUZOYdSE2Vv6eLfT5WOwgAVgAAAFYAAAAAUAQL 3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAW oSUOcNQE0xGAEO+A//kAAAEBCAoANZ31RDGjQ6eLfT6OPAgANgEAADYBAAAAkCctlowAUAQL 3XmG3WAAAAABAAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATTEaEl DnCAGCvwYmkAAAEBCApEMaNDADWd9T87av1dCW8tyO61Ci7o86sA/cNAtSS9a9KMUgnAfjzk suvIf8XZ97GeemOHiIgo1snGNqeMR8T+VAScC01h1U0SoDGqvVSIzgvwbvPgLZjhL/SFEu6X JrRevN1/h64Z6dNRkqNJiREZ1a5kSGqx5Me6RWWjQAszPMKvG+9k3ASdd0Xo0xPq4S1quN8e 5xs9/w2W7xfipRGhUA3TG5xtXspZ0t/chEWJNNNzAlbQpI8PT5HUJFLu/bGQMQIfoETwY/vP GZWTLtdW2Aq7ARr2TxSWD9EtlWM4vV0V3XpE1Tqbp4t9Phg9CABWAAAAVgAAAABQBAvdeQCQ Jy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ5w 1ATT8YAQ7qD/+QAAAQEICgA1nfVEMaNDqIt9PuXyAADiAAAA4gAAAABQBAvdeQCQJy2WjIbd YAACmgCsM5M//oKAABAB0AKQJ//+LZaMIAEIiBA3AAAAAAutDA8P7ikEAAAAAAKaAAAAk9oU v2Ql7EB+ZJAUGv4tlowgAQiIEDcAAAAAC60MDw/uAAACmgAAAJMYuCyYIIkz6rOwUediwIzk 
Jq5wcNC/igxSG8alArund3p/+jcdn7BgSgcORZ6JXIdacuCtMKmQ0GkuEG6qhqh3E8RGyC32 FfKNWdYi4grqCxJIE/H3aAxaf78P/ZA/U0AVMTitjZcjccdZZvQUSQWS7Viss/o/xvapi30+ bVEBAOIAAADiAAAAAFAEC915AJAnLZaMht1gAAKaAKwzlD/+goAAEAHQApAn//4tlowgAQiI EDcAAAAAC60MDw/uKQQAAAAAApoAAACUhC7i5PbaSbQweKR6/i2WjCABCIgQNwAAAAALrQwP D+4AAAKaAAAAlO1YrLP6P8b2vYkey0zoyhd7DCe/yPWlzAv3s6ThazvnZQ0i9oC7zKZzGpRR iQ/VZpU09ZE9dOfgUU1GR+sTygdN9pDWtE/p4hIT08LgricCaCtjsKB2GKbqehOfiPehpBk6 KbUNWXmVKAmYcPsO+5en9gnPl3e1aKmLfT5TMQIAhgAAAIYAAAAAUAQL3XkAkCctloyG3WAA AAAAUAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUOcNQE0/GAGPVw XnYAAAEBCAoANaQ5RDGjQ4UO5h51sxm/khjl5hqW7R4ifIjZaVbVHtjjFqwRNK2w/cKfVMbL XHZ4bTwW1fQ20amLfT7YsQIAVgAAAFYAAAAAkCctlowAUAQL3XmG3WAAAAAAIAZAP/6CgAAQ AdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATT8aElDqCAECvwu5IAAAEBCApEMaPn ADWkOamLfT5gcwYAhgAAAIYAAAAAUAQL3XkAkCctloyG3WAAAAAAUAZAP/6CgAAQAdACkCf/ /i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUOoNQE0/GAGPVwM+cAAAEBCAoANaVRRDGj59q6 aDlMJRpOn/GrgZe+1QZgm72O+wNPQKl/INZLRRJG9D4IjYPJblXestVGDUIaxKmLfT4QdAYA VgAAAFYAAAAAkCctlowAUAQL3XmG3WAAAAAAIAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQ ApAn//4tlowAFoAF1ATT8aElDtCAECvwujIAAAEBCApEMaP/ADWlUamLfT6vhQYAhgEAAIYB AAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4t lowAFoAF1ATT8aElDtCAGCvwCogAAAEBCApEMaQAADWlUV24oDjIyigBP9zplvk7usR1WW3Q 2NG8QEgfQFww0E9+Xvc5oP5C4XQECfHCfzSCI9RlmGuuMlUZcMEmC+v6P/oqwaAPvoz/I5x0 qdv7Xr3wHLTVpLEz8XhTrCIFXSV+NoYRyyzN+HXgcJ/CssKDj7GAn+mj2PAjKtXx38GUdGTv Be1cTYqpObyGuSyvRHw1eRNEJpU+Q/aRJR2t99Oz3pPGS3uUmZpX39b4VXxbV26G1mRqvzK4 YPdwosCjGrd24BTxdadCKoz1kThZCnBY7vP0aRQ3u/KwM3Gun8kHmCpMJgwppcwou+bSF0Mu 1yydBMrhzlI0fBNR131A/bV76DxOzPaH2vJomfcXNsb89Gz0xClU7gcgQMRmZ0z4rWHB0wFc 0IJxqF7whHr9rs55LAipi30+GYYGAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+ goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElDtDUBNUhgBD1cO98AAABAQgK ADWlVUQxpACpi30+w4cGAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQ AlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE1SGhJQ7QgBgr8PawAAABAQgKRDGkAAA1 
pVUfvn08xYUbE41esMgbNHO3mj95PC2shX2Klx9T4U9rpxIY17J6oyGy60EBlefGtiCeTzM1 8uWgzs9r+wUonO8DOMFzu+1gQ5/PB/e34uf2Wj7hAugHX6xSENIxzEtLzgguS3b10vcFkXzy pAUXeY98MIPZgz6LuFwbX7gRb7gV9n7125re2AHGnCTT8v2VQMqQrhz35N9tkvsIOOeg51Od 0hyE/JLpJ8kxmcKwUcv96ksMy+rj0JGrf8W72XzyM9PAquOG6CTYyqxugWXzBLojZqqGp8OH c2Vz/WDfZMjpCPEfDNfy/QUIFL+5OxOWBH7vU1ocO1oYHT7GEH0DMUNio/5gxafQzO9+ibW7 cAq6QAiox5cXGyq1gWOdku54OQJmWyxKNieGzNTkGdxGtMW2qYt9PheIBgBWAAAAVgAAAABQ BAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAF ABahJQ7Q1ATWUYAQ9XDuSwAAAQEICgA1pVZEMaQAqYt9PpiJBgCGAQAAhgEAAACQJy2WjABQ BAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBNZR oSUO0IAYK/BJ6wAAAQEICkQxpAAANaVWao0U6frbZDthdxnJcUKr/mevOR9F/RhNwNtFd98u B9orjvrfZNmDFndHcB1uDvrhYje5vKrVVHsf//vGXe5eWbpn+QoExFQBnoRE8L2qCmni2xv3 rqLj1HrUiYCNGyEQiFuIfC/i3t5UYFnQ3fd8ueq9pfmrl6MUHM4LAVLa6Q08EcfgOitmuVaw Zg1COl3+iWS+VqzzcZ73VleeYuIFOFX88/GWFZypxXiUxG90T+6uKOcrCwvNTjgqlL8w4bwN 1CN32/Kn8lI8teoBuXFHzyBcuHsUsMmQwOawnmN1rAnKM4SU4T4VluO3KEKjp+bo2lUNX3xQ XxgCMrzEhShrHbORUJ9huUNzG/5EgMPS4P1ALf8cLxub8A6GZHoGV17x+bYyb6Qi0uFZGGh1 0yDKRqmLfT4QigYAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf/ /i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUO0NQE14GAEPVw7RsAAAEBCAoANaVWRDGkAKmL fT5siwYAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+ goAAEAHQApAn//4tlowAFoAF1ATXgaElDtCAGCvwU98AAAEBCApEMaQAADWlVtwJ7f4cUcRg PUJPuPIfuhvVmHA9QoTuIHWS44GCeTy88rtQhtXiFKh2XJitvK95H8wwyTvimWkUjJYS1rcY 1HIxPxUr1i5w5g4UvLcDF7U8UyltXEK+Ml4nz1zbJQ3GFAl2OlocQ54OBSOnZOzZx4zalhim 0voGpDRYj+MnnEjNlgiHOUAqpr0zAdyJtedjHHVUwYcNkee148bghXHUKse66rceeGfRPKEa nVSl5+5GrFUCYyQoq2KLXERbFnOttQp3TIjIfo80q/GFGBcfNvNOLj+uiBKffpFewnb0hjNN lFgQ/OKk7IH6lnOLrsMbUY0wjlbZ6lQk/JH1siSO6boAmfjNvQnZNe283KfTFw/F1P11MTzZ zX6Obf72uKyJM3dNXU0Qec6Bs8EDm5o0Q6mpi30+6osGAFYAAABWAAAAAFAEC915AJAnLZaM ht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElDtDUBNix gBD0QO0aAAABAQgKADWlV0QxpACpi30+R40GAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAA 
AVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE2LGhJQ7QgBgr8CST AAABAQgKRDGkAAA1pVdilinDWj4fxOBJA/kmcvd6f5+Uuf6ybqz0DnALbCcHJb38kxDKBhtA 9ika8LjNvcqQiLvVkAt0epiMnLOE0+rUd7KMXsDhCxeru32P7gPEJU77ch/yoH2FHvYnkvAB 8GFAypcOTX1CYuyQxBmJSHIOhMTRudNAhxQo5xyinjOKzbOy/ZiCKkYLETAgzDcfTf6CC+0i ZRWG9NlelSYpaPNo4TXdr48PEh8nlou8XQE+nNGCTYnJQIrGe8G6uw3SzHQ9Tfyp5DRcv2l8 B03kBk3wcgIFNWuRkRGfj9MH7zcI10t9nyeytJ3MLxM2iWIOOAhssifRAunULB1QVeSeMuW3 woMLpNXG4XioCMWLEvYj7lBR2SxzIKfGio8BS9Z6RnJj815W8aiLDf2a5j7ubTdDqYt9Pr+N BgBWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQ AdACUAT//gvdeYAFABahJQ7Q1ATZ4YAQ9XDqugAAAQEICgA1pVdEMaQAqYt9PiGPBgCGAQAA hgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf/ /i2WjAAWgAXUBNnhoSUO0IAYK/D4cgAAAQEICkQxpAAANaVXJJa3oX0liOYbICO29gd519RA qQiknT2DpFW4I9Sgyk+7DRthTHD2o0AmUkjoSF76LAfBjw3ASWpI6NdHAS3DoWmRwUF2yW9L +V46Q/4Gejm2J3Tlo594nQuYoVRfrKcExQ1HXeY4hNG+h/+5Hu/Umn2tp9XnPD0xqgqKyiin UjCqnfvxUczmwKmBsmlvTUNkDT/BcVkOf1k3hJAWfd7GZz2meEAyFTZn7HHdd+bnCPRenuDw bJYrIV12EADzqIy0huMMDiHPFzJ4Z5pnBZvjnkpDqw/wF1trofpg+bHbglSglDR4oCFjv3Oo e1aEiSakyVOuTK7EmJXftE1aRQCt0qU3dFAPnnqOm6hYTaBFyb7kcdPoCmki/oCh9Gkp+scE xoQ0+oukvADzfpn+fE1s7KmLfT6SjwYAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZA P/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUO0NQE2xGAEPVw6YkAAAEB CAoANaVYRDGkAKmLfT7ykAYAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQ AdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATbEaElDtCAGCvwYvIAAAEBCApEMaQA ADWlWLv9YiwwYocsAwvzAsco8p56mlDcAJCW0mFCy2tU84bkdWEjuCcgzPdQKj60lwBTmqKO vIz8oocmsUOtY6+ArGSHoe88Y9PaaEc61Z+8Qh0vgyhZYml42RjbmWYrr5i4hhlhwWKUJrXx 1Xenv78d6AA25sokVy1XvRH0FGTGQkuW6Ze1IjACBFm5bFm1tuTHIcG3bbVJ1HY76qLU8MRZ ouj+vzDv5N56xcCXma4gACIm3yD1wWP5N2DoVoYpcJrB78O76Asq5zd75Ba/uFFAx14Gknw6 nD30hFTC+ylbdNmFDhcZ6qh66Wxv5V7XbU7X6lszRJybhVFMlCxrsphe5k1HfM7q8t6B/a45 bDBZNDAd0X15NghMIgmIFXepkYQEFfZoQ/Yb+g4Gcg6LGG5lFxWpi30+dZEGAFYAAABWAAAA AFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915 
gAUAFqElDtDUBNxBgBD1cOhZAAABAQgKADWlWEQxpACpi30+wZIGAIYBAACGAQAAAJAnLZaM AFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE 3EGhJQ7QgBgr8BSUAAABAQgKRDGkAAA1pVhUKlXmkeKa8c7ZhEkPshrd/Ov1OfJYA78GIN2C g7Z2LBV6W4as+SSOHSSagR1xqpkX0006M2UkHMaWqf2uUF0fJ4pulOjY2p9Kuc2VtnWheRRh +OG2BuuWG/K/RYfZYEwXC6t35VG1iF2LbZrunf9tfGRi+ZxJwkJjv9DYMcrnr0f9LaKeocTZ F2m2FYanj3WQfKrd4T+pBo5r+SHroEVEsVYmt9Evm0VEGQgWIka0jK4zHWmgFRD4dKlY1DWO Xd6faoH0QN+akV6AcT/IRU9kbK6QR+GBm3EVRsRbE7WdDcAuqTUaeUmCuLVRgZ1r26PTqhJY JeWv4JgbBnb0oHzCSq2uD+5U28RNdnF/zg548ICltJNlzEcpfj75r2Io6kV2h55FXU6zc5DY sGb+L08hqYt9Pj2TBgBWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQ J//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ7Q1ATdcYAQ9EDoWAAAAQEICgA1pVlEMaQA qYt9PpWUBgCGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915 P/6CgAAQAdACkCf//i2WjAAWgAXUBN1xoSUO0IAYK/Bu2wAAAQEICkQxpAAANaVZWpAsjkEe VBw8RxSDz+dLvZqAg1AuIyu0VjX2Za4EZSx2Jthpp7FoT305htLNw+qNY66F6MH9O7ieGeop wjJxpcQu3oHlCi0tvfnwkSxzV1FeJ2GXc98SrawHbYY20fqZrlakrdMylGqF3N9TcqdOiJ8L Fbb8sktzqtmyC/aIo4XNw7avWiAncJoGSZerIOfdrakOAlZn7ZqOymgKO31WhxfF08jGICjZ Z1QEw7mn1iHgIaPsbhcv6XMY2Asqc8bAAV5Xkyx1x0rdBi2KCzFG7BoIuTyl1D2wu9lsew4d OIQuv139RI7J8XWltuetE/vNvC9WfIXhHqY8KmDHJCOSSrvSSj/xkWYV09hwTyjhSm3Hx4fs H2OTo0DAiIHoqUqQTwCcZh63ATZsYi39Xio4CqmLfT7xlAYAVgAAAFYAAAAAUAQL3XkAkCct loyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUO0NQE 3qGAEPMQ6FgAAAEBCAoANaVZRDGkAKmLfT5plgYAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAA AAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATeoaElDtCAGCvw DsAAAAEBCApEMaQAADWlWaTMM/xfAJPwl33XeFDfuc+vXiZps4dChqenxWOvCCyyU22ZtCl/ aGS4lTJk5mGbIjEboN/pUCs3Qq0ijzmgY2HccQ995tqVP+KdN0UaaQdWXuLBbWgt2Y44O3kh mRdbGmkBeHqs5gGoZmgFXBmgab2wvUOzv4CUou+MKpJtHBCeLEPFoBtvqcnsGBWzviFdd+rd fh/ZLZC2XkRl4SZ5LuO8ZPiUVIDBre1GFCQl6mow8QbraoV043ztBJrpxYLlV8tNhGrdfdRc CGxKXhnVYV8csPsdAHraJ+gKcT9oo7Nh/iySiuT3zZcgPBVewDRDYIshj4ThoPdUi6eu2KfD vJxoXPfwiplp4Zr6a1cAxZd1vjPgCdPy/hqSgRnElIOZqD3n+MtUUBNLHFQWM9bbpJipi30+ 
6pYGAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKA ABAB0AJQBP/+C915gAUAFqElDtDUBN/RgBDx4OhXAAABAQgKADWlWkQxpACpi30+OZgGAIYB AACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQ J//+LZaMABaABdQE39GhJQ7QgBgr8FHgAAABAQgKRDGkAAA1pVpNM63fz+0yF6pfVLp4LG8v XsFQkBSSKOgQxBv5ReOmEglPGMVH/m4t7b0pcGdsNymXAN8RfXurGgT+ag1wmiENu9ESLplP gYmfp1NpEXUVAw0snxOE2KpA0JoF1oHksAE74/PMPVf7I2+PFoFfPgW5RH4cCMB11q/aF5+u AYGI9CH+JbKv6ob8lyQItX465FJHT9gcy6WwAh9MbzbJPSwsOMbUIxo3MHMoSVkcvL+CFcbS qX4tw6iPozfDDIE8GTN2MO8CbcBZQ+WQ/Cq+vntIxFS3D+6uKVYkMDs5liM6VI+yubOPHrtA oauzJHBv9Pg84lA2Zux58nwetkm6P9mTqzIMpQFrvBLOy5JoyGk1kDzq0VYW0VRAqbnBfnpd lchgg9t/cqJ1z171qSiN2f9XqYt9PoeYBgBWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAg BkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ7Q1AThAYAQ8LDoVwAA AQEICgA1pVpEMaQAqYt9PgyaBgCGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKA ABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBOEBoSUO0IAYK/ClsgAAAQEICkQx pAAANaVaHZRpscA6fEqQ0HO5WQUi2fqzcfGGE+f79ZgdEWrh69RyKAi3mg5YeoUbm6l2s8Yt m3yLrjUoW5Vn+QtAfAHEBFq9uGLieZWcux9iaEESf5fFWPC6nnnqRaSOAAsMoXnJV/WuuEhG nq4v1t0wc3JcbJP8hZr817QkN2LCUFjrox1ghRM8KOmlWEZj/viLSZBkIUrsK78C9jCKKX08 2AlbBtCDjJJcVkUaOkWmmcB7j14Z4gwluwLimjDjs2cGc4wuR/Sls6Vx6NLbuiUKl3uTBB/y 9JKldvUc7KpCX8xKm48FIFZycyuyzuM4SmL3junKjWyiblkebzFbCrtLHMKH7ShfZkk24D8k IiWZYiwHc2KC5FgnyGzGUs68rjaztIhVbAiT0AehB5DHcFS4ND/M26mLfT54mgYAVgAAAFYA AAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L 3XmABQAWoSUO0NQE4jGAEO+A6FYAAAEBCAoANaVbRDGkAKmLfT7dmwYAhgEAAIYBAAAAkCct lowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF 1ATiMaElDtCAGCvwYP8AAAEBCApEMaQAADWlW494uVU6JAFE5ZM0M1Al/9TCwJWKBYU+wjv8 5FLKADkbEBYwZlU768+XYumhGAKcV20eyUGjMT250KrAszcTJnpoppaPJ5Y2Pr1fgqyzHmzK +wdHhF6QChFhpV6nODWpsIUQxsCLduopGtUFRi9jR+pFj0qEw2KJxagApkAkIYU++FzMZqon 5zQ8z+6YQ5CARneq+b4dsOiEgMSj8LIJQN57gzcGRf+VEZYkuoaafRtyQ5EdMVHnS+R9lwyu DCNj8tLLGmcXHPv5EZzJFtVWrJT0FjjnmpiU27RHQzIkO0cqb1h2oXtL2eW8r4jNeUFT24oD 
yc/7/bIMqhevV2e+ZCxhIphescWbNlRBopAyMdSdYPRRjT3fWTlZYr8xsv0xv3vrzjREzU5v MK7Hj8YejNOpi30+KpwGAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQ ApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElDtDUBONhgBDuUOhWAAABAQgKADWlW0Qx pACpi30+vZ0GAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L 3Xk//oKAABAB0AKQJ//+LZaMABaABdQE42GhJQ7QgBgr8BukAAABAQgKRDGkAAA1pVucYuUL BorL9+VNTEnOlcQD7wbmQDgW7Wa/XikcIQ0PP5rF0XbYpDWJ5DP4a09UmsejNYOgXhwzigN4 Fst0xk8gATJPLl/QbOVxfg5ojicowz4MNZ4EykHp5g/rTeVeBOU4zlvjj13mXC4iM3sGAyZ9 RO9NFBrxTu6+A2ITYvpQX/47OWF/gYPY51ybof+NUD1nSTuAiF3RSU3ettl9c6GEIlmKGql1 9++xQ1RqMfPXF0SSv7MM7aRkJzSE9DuKlAnUgInJCOZhsLlQwy/wMwvz1F6ErvfR84DMNTi3 8ye9FW5chCzGd1wOfmSqCrcp70HLEynUa0LGG0ebLPt8a57UQldU7u0k0ZuDVhokE5UE7wIN Zwb/MABvXuWGVohXWHtQwJCqE+IJcsCwVpZoMA69qYt9PjKeBgBWAAAAVgAAAABQBAvdeQCQ Jy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ7Q 1ATkkYAQ7SDoVQAAAQEICgA1pVxEMaQAqYt9Pn6fBgCGAQAAhgEAAACQJy2WjABQBAvdeYbd YAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBOSRoSUO0IAY K/CMegAAAQEICkQxpAAANaVcB64xDHOUBC0z16sMCH6affGWt8Ph/6sv356sdQgNXuDmSAMY Hy65xpQBX8+Rx6zO1ssrh2qY53KFI7wcTgARNCRS9BEPsxft04xp1BIyHwnkRGo4IC9ldCV5 +TJJZimXjjecYHYx6ig9lWLPN/J4HlHbYNcWwG0Q6AW0goPpK358LFYvJlrx/Jy0QqSjkHwd p4lk1q/+IVvbtr77OSCfSfrNM8MzzxA5rvfOdtJ8Ptjbsb9dwL0l4zhRpsYnY3iHT7eJJdvF dLV47gPp3ln0azR3URLcLRfaZwRuOmLL1FJQLyDvJOMg1kErRseRpJSpH/H7RZRVxwiooj52 9L9PB8Z1/L7JatNkkRDcXyUIrxRn6IlIC6SgoMT4q5VhMezoQk6+78pJUK2ddXzSe5lyPamL fT4JoAYAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+ goAAEAHQAlAE//4L3XmABQAWoSUO0NQE5cGAEPVw3tUAAAEBCAoANaVcRDGkAKmLfT4roQYA RgEAAEYBAAAAkCctlowAUAQL3XmG3WAAAAABEAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQ ApAn//4tlowAFoAF1ATlwaElDtCAGCvwNSAAAAEBCApEMaQAADWlXPNG4D7+7jEm0OYo2kJ1 UNN8dKNDozYzYMcEq5ki4ODTk6T9e/vd1NnoK0k0CZJ23VEqSH5WhvJLtUtdD5gzt9+eDF1X V6VLM1w/dag9sBOkVGp1uJNe5kJz6ffYASz+u04ydu45U7Y/N0nDYieWsTrHpqtLGMCsmkGx HhrIY2tHkBltvdvyyS4ZovO9KZyfcMNmxXa1pd3rKgYNRHcI0N7mmittmF78n/V/wWdYGFs4 
DfUfskaR3232y8rfA4bhtE8Uqm4w123daSkSyp9+Cgjp8Ehbvx5nZ2gLyVeB4Kofds4YG4dF PR9o4YSuwuO/D6mLfT6poQYAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQ AdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUO0NQE5rGAEPVw3eUAAAEBCAoANaVc RDGkAKmLfT7BqQ4AhgAAAIYAAAAAUAQL3XkAkCctloyG3WAAAAAAUAZAP/6CgAAQAdACkCf/ /i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUO0NQE5rGAGPVwVo8AAAEBCAoANadrRDGkAEv2 6tj48hZXHPRJbCFRFSA74omJJfJFoOnLrQsUlG1NjchE4N4oPXAHjr1BqaP7t6mLfT7TwQ4A lgAAAJYAAAAAkCctlowAUAQL3XmG3WAAAAAAYAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQ ApAn//4tlowAFoAF1ATmsaElDwCAGCvw0mYAAAEBCApEMaQ2ADWnazFlJ1ZBe1xIl6dD8E1z PCUBxLDmENoxhqdqnkGDxX2MjsLnnc/A6U+e+QtfzYUW/UDCIKJCeX/B932ToeEG7Hepi30+ aMIOAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKA ABAB0AJQBP/+C915gAUAFqElDwDUBObxgBD1cNsqAAABAQgKADWncUQxpDapi30+h8oOAIYA AACGAAAAAJAnLZaMAFAEC915ht1gAAAAAFAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQ J//+LZaMABaABdQE5vGhJQ8AgBgr8GJlAAABAQgKRDGkNgA1p3HpnUeNHDO5LyY3/b1suK0t Tdk66SG0ExThDRkGe7quhcFIliGw6stO6vhkEpVjY7Opi30+GssOAFYAAABWAAAAAFAEC915 AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqEl DwDUBOchgBD1cNr4AAABAQgKADWnc0QxpDapi30+xs0OAIYAAACGAAAAAJAnLZaMAFAEC915 ht1gAAAAAFAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE5yGhJQ8A gBgr8KApAAABAQgKRDGkNgA1p3OJ8Unyu+7jVPrEJMd33TDyFIgE1wD8b9QPps41pM+z6Day ZMOWz/QhQjNASIf/1u6pi30+G84OAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+ goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElDwDUBOdRgBD1cNrHAAABAQgK ADWndEQxpDaqi30+6awBAOIAAADiAAAAAFAEC915AJAnLZaMht1gAAKaAKwzlT/+goAAEAHQ ApAn//4tlowgAQiIEDcAAAAAC60MDw/uKQQAAAAAApoAAACVMOkqWj3egsOGNaFw/i2WjCAB CIgQNwAAAAALrQwPD+4AAAKaAAAAlaf2Cc+Xd7Vo4834eTbHwnnJqOi1JGOcWMhuX2EOTurC jQtg8mLqQ1WzFt6iRtfggxlxPehDcBN9OgbOMIqYZ6C5OWWZI7SRjFrpDThPzdORlMKzFa8w Y+LmzZnrrYGl2b2WIUUkqk+28I999x/LtBTbJ+4G3eB3haqLfT446QYAhgAAAIYAAAAAUAQL 3XkAkCctloyG3WAAAAAAUAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAW oSUPANQE51GAGPVw3H8AAAEBCAoANalXRDGkNm178T3SUcev3IdSZAtyPeHquLqhdd7dLktm 
4cZfDOTgZMtQF+FI6IufhzQCqAgnZqqLfT6GbAcAVgAAAFYAAAAAkCctlowAUAQL3XmG3WAA AAAAIAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATnUaElDzCAECvw ogEAAAEBCApEMaRqADWpV6qLfT7MGAoAhgAAAIYAAAAAUAQL3XkAkCctloyG3WAAAAAAUAZA P/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUPMNQE51GAGPVwOtcAAAEB CAoANaooRDGkakVeuYTrkM03rOkJK1vF6+e1Uij/r1vL7C/Hf251YzyKjyjuFEp7s5HKT8+t 6QAuLKqLfT6AGQoAVgAAAFYAAAAAkCctlowAUAQL3XmG3WAAAAAAIAZAP/6CgAAQAdACUAT/ /gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATnUaElD2CAECvwoO8AAAEBCApEMaR7ADWqKKqL fT5fQwoAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+ goAAEAHQApAn//4tlowAFoAF1ATnUaElD2CAGCvwDSEAAAEBCApEMaR8ADWqKEi8/iKXj7nL qMdXM98wAEVnxOS+bbHQHdwtN2w4wsW7OKLz7LB9BF5Kpxlk4v0M/ptKcAmEkU3J1ICvvKV6 15jrEq+7yNcgCAh3qr2jCgNQmjobVwbmeh/clymOeQ1ErwYINHbwrUC5Q2JuujlgB9O1Ishn AkQaqXg327N99lAAyd1UIks9CbXBO3jycckWjtqKl0o4XwsMY/5zfpzPnVLycvXE1s0aWKUF bKiA9qwFv1gvjDqgY023g5Da19tnRQUtlA/Meyh+zSxYk7GrC/hi5yi3B4bXuXwviyFSs2P8 6SLljOYJYGX7wI0hkxSLhnNCzfaeji0MnN/WusImXff4S4ocYLGfWoz0itbnZJkVoMNUMjXt eUsxDQol6vlCT3tH4ie55qKrsYqAwytSF2uqi30+qUMKAFYAAABWAAAAAFAEC915AJAnLZaM ht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElD2DUBOiB gBD1cNYyAAABAQgKADWqM0QxpHyqi30+d0UKAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAA AVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE6IGhJQ9ggBgr8Ouw AAABAQgKRDGkfAA1qjMJv+KGsLJML9CSq455MpDXDMC5x+nz8yUPnVLwjN76GHdyK9WFHCr8 6Uk3iX29YY3dpdZzjVtkO0puJiIW9+uQOVYaipKLJ3KoDHSkB+QKUcoelbSo+nY48gZ+xIqd 2KybRIhzc9utEyHQoM4q1K126+hp7QOFsgeeBPD9JzvoyRvNQRBgQy/yC8fqNwnArcP78Cy2 3O/wTc5QJoYuh4xxnVpKIF8Xg4ZFqw8j7Jenxp5Sr3OiWdY3bXKbBMhbEcd8RHIg4KFsXP29 UHuL7CL/0UC1rCuzyvnfTwDyyy0YWlP84Be5rvQTiVAcZ00VcflaRbdyF4J+Shfnnf4dAS1b Dk29qVQxeTfyK4B2EwqPLvHTwfqw2gL10/j7d1vyPkSmMCI4bAsGHPrekp0o+brUqot9PvVF CgBWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQ AdACUAT//gvdeYAFABahJQ9g1ATpsYAQ9XDVAgAAAQEICgA1qjNEMaR8qot9Pk5HCgCGAQAA hgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf/ 
/i2WjAAWgAXUBOmxoSUPYIAYK/BqkQAAAQEICkQxpHwANaozxYJaIfRJWz5cVLfUanCctT0M M4OG6Vf8CKo0aNNf1QhZmGD38o/RdQaUoGkQh8tGRRCkziMrWCvpcrbwN+sEtZyHgP5Tj0Sp i20EZLrCrQaR3tnppIUJ2n5y9pu0H0HgY5EYbz764F1DMS1Uhy3siiUKmR+OyYeOrgforNfS t6NBnblSRYr11hxJy0nbQUpif+L4lOuU2WKIGDwhs1d0D+Y/WjI5VHfmVpFVo4SbLTzMV3q2 hdYk9q4p1F/XcefE67B4FbM50H6zg6AXGsswboqmLeECLN+e1JzLn2KP8T7zDJGjbf1vJq4O s4gacAau4kl9xRISx21Ls8lzs1zmHt6etdLihX2VvbMXvZMYBLLakHEtcIMK7V9GD/Mevvqh sbFrJ5X1rqAMcEBMLgIjyKqLfT6+RwoAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZA P/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUPYNQE6uGAEPVw09EAAAEB CAoANao0RDGkfKqLfT4oSQoAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQ AdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATq4aElD2CAGCvwKAYAAAEBCApEMaR8 ADWqNHD42xuczlWXfdi2mo6ylGQHL9qsuD6Y7mJmZi8HkXnzCHLmYF/3GQr9Cp9saBPCzStX RzxKx8/0IWlwcRZ/a21pgGK0XzWMbJEu+tfNiLx3WDWRYcj1yWT8EaTAONToJs67L2Awrci6 UTCGdKTdS8ufqS7moMWZ42+DEhLxJKzgzpF5hB6//7i5/CCMGhr2dNC1oyCdrQo1KDmXqGCk LclnKuvYV+88B5pe8G7mAG8BgjP4bDVgBZxm4BTxs+JLRKM9CBY4oWUM6ep1JiFVA50ps4ud c+Qs61co1ROrdrneY/GY84ZanOyYQEzaVHTbJzudgZD8VzG2bD9a8l9UcstQi2zLxlBhr+FF mYQ6+ByQB5bCeSpPzSH70FxocHOiwpInovogZXWLYQi/UawM/xyqi30+okkKAFYAAABWAAAA AFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915 gAUAFqElD2DUBOwRgBD0QNPRAAABAQgKADWqNEQxpHyqi30+BUsKAIYBAACGAQAAAJAnLZaM AFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE 7BGhJQ9ggBgr8Cx1AAABAQgKRDGkfAA1qjRID1Lyk4hgxrvtp1NJ+hBTxc63gcs1bm7FTofb 3qIVsK4r4TRdsfFD2PGNebFfHm9GsVMUKJGfW5dREw7jo1AXdPWkAZ059BBVoW4uwX/oZLGf Bz+RvuIaXLag5XHiQNkiDgB+2qZleH9bvGFyWbYpXinGo9zcO+M8ucnh1Tv1fiWNUKsPhohy Ugfhft9lKdGdQ8ZJ2b+VPKkAODorL8Jej+Y77sqpLPI1kiUaDcJUf/V9eTg9ugJdCdqbD0zb D448HkeynkR5UepS8FJ2wSFPCCxXrlA5Uc7iLSTZkTuujFsz8A8uR99pDi8RA5O+j0SRnbID NHID/UpBNy8y7zjEYC19uvXYwozdV/SMYFnIUTCLgHoAYWbSFaTh3Tnapm98fWusGhG2fMST eekfXfqQqot9PnhLCgBWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQ J//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ9g1ATtQYAQ9XDRcAAAAQEICgA1qjVEMaR8 
qot9Pt9MCgCGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915 P/6CgAAQAdACkCf//i2WjAAWgAXUBO1BoSUPYIAYK/APawAAAQEICkQxpHwANao1HXe2rw2X u3aTUmwahHs2Q4czRZPkgGwf0Q1VbnjHVnpfJHdIWl9DoLN8QbUG4COg9xY2EtBK5g/5nbi+ t+9UAeLG1lmPjSKdPFHnbZ6E8L9HZ0zlPKQhgqldjY5zS1D4jDTOJzEP8XEvkRTL54S95T5/ mzIGIg8me1J9zYo0muciHUrNTMo+ZtrdHgOdlhLbZFvMZH7+pnvVa/KnhVyD1eUaz29qbapt o2GL2SQijwbovtldxfr1hHrciT23dlv24ed1JaR6WP2rkfsCqj/Ezg9gchSRWOmNMGZecXm0 fUQH1okbEV8v9UdzaKtHVLHV2n5Sp1K36LHUqR9YLhX/zAV0tG/d8L1djIMqZwp69B/cZs2S gHYW5zkMUiSUM4+73bkgRF1gs8XvhScf6eli+qqLfT5MTQoAVgAAAFYAAAAAUAQL3XkAkCct loyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUPYNQE 7nGAEPVw0EAAAAEBCAoANao1RDGkfKqLfT6zTgoAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAA AAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1ATucaElD2CAGCvw Ci8AAAEBCApEMaR8ADWqNfvroSPYeb+DaNB8oTS+okTpJv0vfSRxt1RFLnROJvCGBiKpKsTZ AzdlsEukJFp3RXLbDovHh98gfT+Qnxfgyl0gcDWbAkmWVBP0BaA5dsRb5bKMia1McFLWEC4R apgt5ugAvFKKm0LlqNxEft3GHKbQIVOKqkVKVRvC6N3iNfEfg55T+JdSKZey9+Lmf8bSXrxq 2AgkLMby1qdISOAzvPyBDrMM3zZlP4ZPndtEr91S1QoO//8tePibiTXiIA9RG5hi2CGCBGiy t7Lg4+AgXSaY4S/0dNXaPfReaAJxMYHjAg8uMvKg0yUDWqDy4B2AZTT/y75fX6GmjJKuVrBQ p3HuxjZ/tw+AaDCk+YyvNGmQoXOAA4SwQb4RBylpdKtSg1wcVy/yCyrocc1/lbOFbX2qi30+ H08KAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKA ABAB0AJQBP/+C915gAUAFqElD2DUBO+hgBD1cM8PAAABAQgKADWqNkQxpHyqi30+hlAKAIYB AACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQ J//+LZaMABaABdQE76GhJQ9ggBgr8LPWAAABAQgKRDGkfAA1qjZr/Bz5+7G9GQqfLTQ6owSV JZXUPwMiahDgJzNsphzNPl1u27wlH3iK2wABew0ZeGbiS3JQKo0zmRBRhJLhdSMhAsMET3C2 4Uybkg4eIqahP16wrOTmAOwUHnZ6+BhAYo9GedFxekUlaeX7rnRePO1Wag56bX/Cw1GCIvqt zmvbphWrVbhuDHMZ4vzgPMIrZjWmJLKP9WtEH4yGoxXTcOfJ+X9Br/HC4PDfp/yJ0vYXUSNh SMWVjyyPRxmoBKmiyMv2tgFQevf3jwx1SlnWrAfzQo2imbokvxu+57hPLapdFuOjpTnbOxLd NPMPqDJhCpu4y+iMAA4oaKJWZC5btzhWkgMblqEgSso2pketS3j74VFT2n9ofdjRQgPeEaxE XbGd654Z2tUHC8jTE7JSBF+Gqot9PvFQCgBWAAAAVgAAAABQBAvdeQCQJy2WjIbdYAAAAAAg 
BkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ9g1ATw0YAQ9EDPDwAA AQEICgA1qjZEMaR8qot9Pl1SCgCGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKA ABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBPDRoSUPYIAYK/DpHAAAAQEICkQx pHwANao2+eUaJVuiD7HteQ7ZBd3SOU0M2tplT2XQYNes6LLLy57/3ALAmqWRzPI13Jv4s1X0 760bg+3gqNIVUhPq6Ix8Y/i1qwyspnDT4kkRXs7iKnEH15Zjv1xHWcCPw+EtYAsrga3EFKoI 5XUMjejpvNySMoiMk91Vro38aEj1mDZ7AjTYThWGl1x0ZQgI6pKvD9TA6uEV0rgz3ZTE+Xud 4VptEunMB5Q/FHDqQUXRND0h+d5JpDcTa/2W1YR2WOtdU65NLF9JkkpefUiJwritOjCp+uJc j9LJfUSJQZ8u5nZ3oKKx9mq31FPmnSa1AMYMjyk7/DLyRxFhWDnzIGw9914vpOcOukuL43nz 54o2TaWHrxlFHa8/np8lWCO5eMEWJl9ytew27alusNxo6J3twB8+ZqqLfT7DUgoAVgAAAFYA AAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L 3XmABQAWoSUPYNQE8gGAEPMQzw4AAAEBCAoANao3RDGkfKqLfT5JVAoAhgEAAIYBAAAAkCct lowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF 1ATyAaElD2CAGCvwBYUAAAEBCApEMaR9ADWqN9jfOyONWC92lZUlrqyMZM1clNvw6iRW2Jla SK/M+JsfhMszDsdZjbnB39gjoNDpfkUbTNKRzSj6BYuPMDKxN0dh1FoTpwe9Gj+Ye6AY9aux iY523VQiJ7a8mOKbynxm4MbpdBnm5rTw7k2w7mOwblMaYTTRH9GTEEK5ptHGae1MhliOJSgF Brt3tHRXwaO8veKfOrBaMGvylbsOz6hxsb6kJu+jsvQk2Mq9wRbQ8e3Qxy+HwJ9OI1OzlDh1 lTcUwuH6VSS0vXM/kOKM+x8/cb9kQI3oKFiV6uIojsuMqcXXlgfNxV71mFX77vkbngIMo1dZ IgtRtDioKL3UtnuLXndHa427ceYfxW/cUadIHZRveCMA+nhk1OCZGHN1fqpr5v+7s3/R1I78 Kgocc1T3JiOqi30+01QKAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACAGQD/+goAAEAHQ ApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElD2DUBPMxgBDx4M8NAAABAQgKADWqN0Qx pH2qi30+GlYKAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+goAAEAHQAlAE//4L 3Xk//oKAABAB0AKQJ//+LZaMABaABdQE8zGhJQ9ggBgr8IsdAAABAQgKRDGkfQA1qjcMGEuF zErDKg1N0HYl9NfOeLpOdh2esSYrTtrXoJC5mDr2dWpzq8ObL4lMCYlEzewQSj474LK0XdRz 8GMBQUoR/3ngEGQhcc3nwuYlIGCyk7mKQ7K9rkiv4vkZzscu8waaJGaqiJTtM6GCq9iDoWc5 9lIjANaZ2DykFYgK2Wm2iF2H6dK71baR0/k6Wjbaw+vezqTb7gz+eq8fSDgpJU7a1v0burjd qXWkJDdm1AxEG/O1KdPHg84BQreHOv1zDYgo4zKz8k3po+ke9Y3ZD3ZpYYsAz0YKwRXLMWmk iKJaFSeOuA91AcTUNSks3qlxuK/E9xaw1NVIKlPPV0pDS1+lJXwVbreScTT4Kmcgy4nb0mUK 
NCv0r4BJDPQxJbcak8ZCQ6itL2sypcavFcaqTQo3qot9PohWCgBWAAAAVgAAAABQBAvdeQCQ Jy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT//gvdeYAFABahJQ9g 1AT0YYAQ8LDPDAAAAQEICgA1qjhEMaR9qot9Pu9XCgCGAQAAhgEAAACQJy2WjABQBAvdeYbd YAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAWgAXUBPRhoSUPYIAY K/C/0gAAAQEICkQxpH0ANao4DGHPhTNnpL8vR0Z9+A/+W3BacdjlY3ILzdOxFGiynjd04PNp Q9RDAMvT7uPjHP4ix4EBGaDvOjXZqBm/XyhjNgYx4cbPuUjpRMXvrS3SwIEX5lFUiKOWDTgW E15BG//sfkYZo0ls8hbSlVfWY+prckksK030rB8abmZ2abb41YuM3XEIdaZmJmO95D03jjEy dXuzV3iAVbDprp1EzvteJGfbraMat7D0jj9TW2wnxxQD/8Gi81RkGY8Vbv9ERaGPrfdEUyIr TaJ76xVk4vdp717Ay/xblnBB28h0zTn0SxVZQjbtNgq/kB9GAPj7eAj3uTnZePpsS7k3tFk8 ZmnBsuhgUWoaOS4pUZarlpVUxskZt9J5kzabKU6LkBRayOadKhwtpPJYsS269CXcs470wqqL fT5jWAoAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQAdACkCf//i2WjD/+ goAAEAHQAlAE//4L3XmABQAWoSUPYNQE9ZGAEO+AzwwAAAEBCAoANao4RDGkfaqLfT7QWQoA hgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQ ApAn//4tlowAFoAF1AT1kaElD2CAGCvw+8wAAAEBCApEMaR9ADWqOCI+85EWh4Wdrsjv2BtD KaSsXaqiQALh1dqF615hWcIYTXTv/WAj98PUUXIKPPb6BEPTzCc9juPhevvc8uFx1pcAIlY3 Hty/7Agj+sGLFVOvNqg838yn+vms405XYSYcyVSZaMzb95TpPMy+lXOu1GlvsYt+fYVum/dO iZwLtXbuHkI95v/JVl9EQlhrkM7XlNYbFvUrA4jysQ/PxK/q/mbPGNnFj1TrBvoZtCmrBDCp VATRic8eqYg9YDoQaGGZXq9xboWBQZbLE8qlYklg9pnOyqYD9ZocgyXlcp4HgdjAtKLH7olI 9SruUmsKR7ULh2tUgUXf6XMsSnRiWKYWNAjiKAEbT7H7aAQd9P0DYtGL3sLVhQPOKqYHh+VV vWgCg/QSM3UZTV6EZCIJNbxh492qi30+XFoKAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAA ACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqElD2DUBPbBgBDuUM8L AAABAQgKADWqOUQxpH2qi30+llsKAIYBAACGAQAAAJAnLZaMAFAEC915ht1gAAAAAVAGQD/+ goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE9sGhJQ9ggBgr8K8LAAABAQgK RDGkfQA1qjnEKNJa+YK5DDb/YSiSFHdOidsW2y6lHDAF4ZawqoJSBP336cBnzYNmMtzd2IpJ Y0GjMM9L/hU9nEqgr+FbBsvlNdGyddleKeCeiBlGEaqFsJLDerqhzqt3TsXXKRTzChIrzusf vFcZTj6Q1SGixvAieeRAfQhL2Sp7UZY8dsdkFjsfxtN584br8wPPtEXDYPhcWNJ9qOapn1AB 2oq/ThjvqIFgsDd/kOEnAhW4/ju7kWmc4H7oeBpJufhoWWl/JC4CC1zSxQcYpSUsmmzBzahI 
WxtwkB9by7maF6QUriesxeRjuD1sbK1wP0pjfk/BfMhOHmEGzuL+GrhFTr54J6QD0wKIDD1J FcS1gB7WYuUv5wxaJxFb1FA2B+N2nNYFVVg9RxAbog//bFJAIdNxczetqot9PvRbCgBWAAAA VgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT/ /gvdeYAFABahJQ9g1AT38YAQ7SDPCwAAAQEICgA1qjlEMaR9qot9PmldCgCGAQAAhgEAAACQ Jy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915P/6CgAAQAdACkCf//i2WjAAW gAXUBPfxoSUPYIAYK/CgigAAAQEICkQxpH0ANao5zm84+rVtUQcKgqmcRCWoEqhd7BP9kSB0 giX4CEa2+Ijo8eu7RbUvoz4Lcz+0cSTvr6x97qWIPZsN9U+CQFZyTO4a2sHB0HhLaNNk/Vg6 MnFse6Rvmsvd1UhN9xAHck7QUBVM5ma6dCsu7fRbXihQj7wrriwetQnvGKoz+E8rDjEH6qeB pjUAJowZNDod4Sm5S+a6AJM8RyUXJaI2RZhFSt6iZHMKHTcbfJN4o/ZaWN23qpxwdmnBarvA LkHMpR1hU87uKOjQNaZ+3eIpafxxrL5/a3lDcxStbQFrVFWAaM4dYhommCzZuZGvft5MxfS4 ee9W8U+37gPal1pQ5wb4LH5si/YS+MJUiWamnsGkM2ovI5/Hp/oJXM5D4JAx9o01l4yNQXe5 OX5RdD712IMsWKqLfT7OXQoAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAAAAAAIAZAP/6CgAAQ AdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUPYNQE+SGAEOvwzwsAAAEBCAoANao5 RDGkfaqLfT4/XwoAhgEAAIYBAAAAkCctlowAUAQL3XmG3WAAAAABUAZAP/6CgAAQAdACUAT/ /gvdeT/+goAAEAHQApAn//4tlowAFoAF1AT5IaElD2CAGCvw2bYAAAEBCApEMaR9ADWqOWOR n9k1VrPq4PjaWSyxXef3gXxdYV3suJLhhdwXNwiwK0FazXK6uRBzR8zJqAhRoHQslIWPNnl3 bgtvFmKAcP1XIw/HoXUj8cw1x3SFrxYC0MfePNREzXUwaIcmzODVCY7gTYxNweJlWvZIq/Yv yuIGwSKRlJfKPnF4LN7YqjubYWexAUSMGEt2lOn9TejUQsi+DtQtdyW0kqLAPFvR4nHmt+uf F8A2h5hqJMveHJrjx1dTE3DU8cm6/GW/Q5PZPlKAZtVhtEDIstzVwxlfT2GmCAeDnVJ+xb5+ sOMOaZpXNLImoouY6G3Ik7KVLXmxYBZFQRe+YlMb8Vazmp/9ZqrSAVI0VkEHxx9AxUING1nm +zKvk2I5aktVn6QerZ/0RQyStw7R6FBeVIB9RG2q9UKqi30+w18KAFYAAABWAAAAAFAEC915 AJAnLZaMht1gAAAAACAGQD/+goAAEAHQApAn//4tlow//oKAABAB0AJQBP/+C915gAUAFqEl D2DUBPpRgBD1cMRaAAABAQgKADWqOkQxpH2qi30+/GAKAGYBAABmAQAAAJAnLZaMAFAEC915 ht1gAAAAATAGQD/+goAAEAHQAlAE//4L3Xk//oKAABAB0AKQJ//+LZaMABaABdQE+lGhJQ9g gBgr8F8dAAABAQgKRDGkfQA1qjoGZ8sFrWp9BZ7AviPlROV6BMpsQh3iz1GjPN3XhVYOs87P Ysiki/5iMD82yjJky6FPMWlbEZYPcEpIHGJf8Vdj3UTV6d94+WWkN//ycv1CkyIdTwQ27vWm 1WHlTf8SwUOipYLkP8F1eTmaIyizkcMzJUYGBoBQ1tuGCjwot6z7C77dOGa/WzRWZh/YToKp 
RDBz6e9H+XZbgVyMUocjbtlYgIQQiBnKBhJF6OyIw3f2R4GFIHacYJiTuLTAMc/Wlk8vsr3b qLvkHv6A2MQ0+kpachz/WkvihGXx2OmSE2cvPgKPbleG0ZyuMBS8Pu5TnruD2kenACW5xthh Sf5176e98Ua6/SMUEC06YI15+nt0HKqLfT5PYwoAVgAAAFYAAAAAUAQL3XkAkCctloyG3WAA AAAAIAZAP/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUPYNQE+2GAEPVw w0kAAAEBCAoANao7RDGkfaqLfT5pZAoAhgAAAIYAAAAAUAQL3XkAkCctloyG3WAAAAAAUAZA P/6CgAAQAdACkCf//i2WjD/+goAAEAHQAlAE//4L3XmABQAWoSUPYNQE+2GAGPVwWIIAAAEB CAoANao7RDGkfaih2nxQvjscGJb4e3HT/XiSS4SLybNbuKMGoymfk35fomlByR5NJ4OBnCxq l9/K56qLfT7o7goAVgAAAFYAAAAAkCctlowAUAQL3XmG3WAAAAAAIAZAP/6CgAAQAdACUAT/ /gvdeT/+goAAEAHQApAn//4tlowAFoAF1AT7YaElD5CAECvwjJYAAAEBCApEMaSBADWqO6uL fT73CQIA4gAAAOIAAAAAUAQL3XkAkCctloyG3WAAApoArDOWP/6CgAAQAdACkCf//i2WjCAB CIgQNwAAAAALrQwPD+4pBAAAAAACmgAAAJb5Uiy5XnLAZsiOfXv+LZaMIAEIiBA3AAAAAAut DA8P7gAAApoAAACW2yfuBt3gd4VF7fcF3Urn8KJ3J0/LUgHrygsujzZd1dz9J19NwCx20iFM w5ZjMAapOglr3ejBVsh9+EW06j55qPEfAEFVDAedZV2OV/eKTNIrvy7vOxNQ2Iy/Ze0ivyk0 zE5hNHRTXEGdDQVJd+mXccgVXirhNOL7rIt9PuhmAgDiAAAA4gAAAABQBAvdeQCQJy2WjIbd YAACmgCsM5c//oKAABAB0AKQJ//+LZaMIAEIiBA3AAAAAAutDA8P7ikEAAAAAAKaAAAAlzWw R0D6uOCXhVHi0P4tlowgAQiIEDcAAAAAC60MDw/uAAACmgAAAJfIFV4q4TTi++wrGgFL5TPG Juy/KFSxK8W47FJxfIj06uptRJtYDWjMCqPUuHQ19331XgsbEbxP4N7ltaOblpadBP55biD/ dBts/D6wSAM9hUs9C43ZdUMOr2FX0DDEgjcGkJlKrcfqeoFA1Bh0gG/xE4+8oK6s4Dyti30+ 6cMCAOIAAADiAAAAAFAEC915AJAnLZaMht1gAAKaAKwzmD/+goAAEAHQApAn//4tlowgAQiI EDcAAAAAC60MDw/uKQQAAAAAApoAAACYlFPE9SG9tSi116z2/i2WjCABCIgQNwAAAAALrQwP D+4AAAKaAAAAmBOPvKCurOA8yg/w0VRng2OcePE4bTUqoT1s8y3Pmw37N+QQgteB0pWzEhBo 3Vnf7JmMthbpqp8QIixIwdxf4T9phz2u0yBM6msN3mULqSuSnxaudmDpwLOouG844nNn+lh4 EmYnXzQsMaFhxeG9J2hwevebnvT8Ga6LfT6TIgMA4gAAAOIAAAAAUAQL3XkAkCctloyG3WAA ApoArDOZP/6CgAAQAdACkCf//i2WjCABCIgQNwAAAAALrQwPD+4pBAAAAAACmgAAAJnHwaIQ iaVEvap9X2P+LZaMIAEIiBA3AAAAAAutDA8P7gAAApoAAACZcHr3m570/BmVlGK6i8KnUjmV yTx9govJK8/jCuqTxNg/GolvmJ1UWtoUMVxMJY6JNMq37ljILzICjt0UFEt2ilY+zJ92pyP5 Jpu32blKgkTBJwFSkJcYZY/mQGwR5ua7M5Tj8kBKrmBSKtVag8ix+6Kjengewy69r4t9PgB+ 
AwDiAAAA4gAAAABQBAvdeQCQJy2WjIbdYAACmgCsM5o//oKAABAB0AKQJ//+LZaMIAEIiBA3 AAAAAAutDA8P7ikEAAAAAAKaAAAAmkrceHIv1m1YHlwCtv4tlowgAQiIEDcAAAAAC60MDw/u AAACmgAAAJqio3p4HsMuvYoBcPPB9kA9fBZiR3rOwjraFKvkVihjhPxMsOaWzlbJzaQesNPH Cu/0/IIdOdfG5yVSdXRyzSUaPMO27mbv8VfW+1V5sLEJHhbizac+dnROBBQLbxSap2eaDrw+ 37up9SfKFgMdDxSNQGy8F3qFIaGwi30++9oDAOIAAADiAAAAAFAEC915AJAnLZaMht1gAAKa AKwzmz/+goAAEAHQApAn//4tlowgAQiIEDcAAAAAC60MDw/uKQQAAAAAApoAAACbry997gsO FTJ/A8je/i2WjCABCIgQNwAAAAALrQwPD+4AAAKaAAAAm0BsvBd6hSGhrzUfNPNBkHhmkmfL OBwZUKCaxUDcGasKUIeIIua3COHiKNWSxD9a/WdcJKpNBtS7dXci5SmoEkwcKsF0JSjuzR7M plSCQl5aPb3Y/RD2FDRp8voW9XKnCTJO1Mo5wk10m7rqdZ840pF8D3w3UdYdtLGLfT75NwQA 4gAAAOIAAAAAUAQL3XkAkCctloyG3WAAApoArDOcP/6CgAAQAdACkCf//i2WjCABCIgQNwAA AAALrQwPD+4pBAAAAAACmgAAAJx7mRhOsWChTmwHHan+LZaMIAEIiBA3AAAAAAutDA8P7gAA ApoAAACcfA98N1HWHbTppEVkaBi6FanTwkyddPgGQHg9vrWV+6loSoSizutho9AONxUoONVS LPP3/jFQp583PR8mFgXdreBlzFyduIL/HUfHnbjU1yUpoZTzW6hjebzCj6RO9iTdgpsLeNPx HIa7rbjq0Twl9p+igvexUTZasot9PgyVBADiAAAA4gAAAABQBAvdeQCQJy2WjIbdYAACmgCs M50//oKAABAB0AKQJ//+LZaMIAEIiBA3AAAAAAutDA8P7ikEAAAAAAKaAAAAnR+P7oST5dSp LMDa6f4tlowgAQiIEDcAAAAAC60MDw/uAAACmgAAAJ2fooL3sVE2WsXWculjCoTVvKHJdJUq 0h83V5jk2Be3noTaeAkCuNEou/xXR0jlgjtdePbL1xGi8pOoWTQCqC1Y4SsXzKpwaLiW0I1R QkafP4WBVR+5o6uM0n49UGhW8SZYuC42c9LO2Nk/lbCX5xHGj8RnCgp/QZyzi30+2vIEAOIA AADiAAAAAFAEC915AJAnLZaMht1gAAKaAKwznj/+goAAEAHQApAn//4tlowgAQiIEDcAAAAA C60MDw/uKQQAAAAAApoAAACenqS39V+Y0oDbZQev/i2WjCABCIgQNwAAAAALrQwPD+4AAAKa AAAAno/EZwoKf0Gc2C5gaERSEVJMuno8c7qKl2DTaA3Io2ThGUqXE3sYoHqFrtWZCit/rzCH 44+sOyjliLH0ZH4TPWvTg2jpddzYfrPG3JmIX94xwLvhNBjTzG9/equHc+1VsOb8De7PmhO9 DJcwIksW/DLmC1GE01ngErSLfT4aTwUA4gAAAOIAAAAAUAQL3XkAkCctloyG3WAAApoArDOf P/6CgAAQAdACkCf//i2WjCABCIgQNwAAAAALrQwPD+4pBAAAAAACmgAAAJ90lYTKJv2ARXu0 syj+LZaMIAEIiBA3AAAAAAutDA8P7gAAApoAAACf5gtRhNNZ4BI0dFU+hciwSzfAyW9EuZlg 6y+Tlwh35rCNXhGkfMKVeO+Tg+oaAPWw98DV+i7srujB+tZ/vVWkdRZQFQjcnWuT5sQoZZdw 9+ambKyMHtu12rTO1X566x/K4jg03UNZdVnZZzJulXFwhmOkJqgwuXnutYt9Pi6sBQDiAAAA 
4gAAAABQBAvdeQCQJy2WjIbdYAACmgCsM6A//oKAABAB0AKQJ//+LZaMIAEIiBA3AAAAAAut DA8P7ikEAAAAAAKaAAAAoGViZjsowg7F9hraQP4tlowgAQiIEDcAAAAAC60MDw/uAAACmgAA AKBjpCaoMLl57ji00yCobvfrFIrbqlSwqS1M8jhA99bBJRFzP1nHORoobajWygdm+pXKGnJx 8Zod+20CeA4LtjFCrAealj8yPUlpdjQmaJLPEt/a24SuwO4JEWMLbGZKTq1XiqxDg0CD9YLO 7UhQ81pNptqu3T5Ed9G2i30+7AcGAFYAAABWAAAAAFAEC915AJAnLZaMht1gAAAAACA6//6A AAAAAAAAApAn//4tloz+gAAAAAAAAAJQBP/+C915hwA45AAAAAD+gAAAAAAAAAJQBP/+C915 AQEAkCctloy2i30+tQgGAFYAAABWAAAAAJAnLZaMAFAEC915ht1gAAAAACA6//6AAAAAAAAA AlAE//4L3Xn+gAAAAAAAAAKQJ//+LZaMiAAzWOAAAAD+gAAAAAAAAAJQBP/+C915AgEAUAQL 3Xm2i30+zQkGAOIAAADiAAAAAFAEC915AJAnLZaMht1gAAKaAKwzoT/+goAAEAHQApAn//4t lowgAQiIEDcAAAAAC60MDw/uKQQAAAAAApoAAACh2aCqL9jre0F7WF41/i2WjCABCIgQNwAA AAALrQwPD+4AAAKaAAAAoabart0+RHfRdB3tQZfmzaMS9lCnlDvhrXChF83STOdLJlSLtEdU oXsjRqM4rXilcCPCirDWkDLfeWa4EEvvX0wr8zKLF/PA7U8seZ8Lzm/RMUB8/qd4kSiYn5jt QVGoq2jP8mNfd3qgCAoLNy8oi0aGu2/aV7clpLeLfT4yZgYA4gAAAOIAAAAAUAQL3XkAkCct loyG3WAAApoArDOiP/6CgAAQAdACkCf//i2WjCABCIgQNwAAAAALrQwPD+4pBAAAAAACmgAA AKLZ2rs1yAo7bPez+I3+LZaMIAEIiBA3AAAAAAutDA8P7gAAApoAAACihrtv2le3JaSPPOWC B8KAx1m8U6KLuFGg1kdEe7vyqYxNCw/DkCCFa2uClwlgwaMFcjrC7DgyQpobZ03CSS8HyHBC TpBzeaeReoPG2q0ozMwdyrmg5ihkdvy/ZI0BoHjSc+bmgt2J2MXKaqbZ3IbXsnMwGZSCK5uI uIt9PrLzBQCGAQAAhgEAAACQJy2WjABQBAvdeYbdYAAAAAFQBkA//oKAABAB0AJQBP/+C915 P/6CgAAQAdACkCf//i2WjAAWgAXUBPthoSUPkIAYK/DEqgAAAQEICkQxqdgANao7up9s54MP XA9JU1bKNPkvycxB4fOboWFE4H5UTdDehobhUtWqmyODYC1w2lo47v+JEcsPZvTJ/BIc0fzj 34qRmpPkQa293q4kQyEhhQ/Nw2M9/hxMlg/PJztKn2GF1FkdMIZl1R9vUkmYcFxwuZldaw3k pIeE7dzz8+fj18sNE15UitV1mN5PMbnKzz7yf+oJmNApooKJOr6w5bNYSPcLOZ8mxHqGgPTq Q7wt5iBORiGbe+OqIoanZx6oT08UsESAC5Di3zV2C3BjIc4MpcZaHEi0jmvYY8ssD7MKbw6B sPLX41JLCX2KLQfgzF6Ue66guOchwu8DFByNLhI7kscjsh30Fm9cY23CvGNOIvhOVJFzs+4w NnOMcAgkXFae0Gwrx8MSjNExC64zxMCiVFncB7iLfT7p9AUA1gAAANYAAAAAkCctlowAUAQL 3XmG3WAAAAAAoAZAP/6CgAAQAdACUAT//gvdeT/+goAAEAHQApAn//4tlowAFoAF1AT8kaEl D5CAGCvwsxgAAAEBCApEManYADWqO2EdOBupMPx5qFt8t5oiUAlTzD/87rkaWWOPmg+kMRZV 
tLq4XI3KDZd09wbtwy6I5OCuHbXESZa+EYUHVXG8aKx6uKr6YBWE3x3bGZ/5sXAsqhbzHreg tbpRUypL3MlkVTxHNn7BuWwnu//jTGaXbeZ6qIN8UytkkUfaCrxyDQ/luIt9PnKPBgBWAAAA VgAAAABQBAvdeQCQJy2WjIbdYAAAAAAgBkA//oKAABAB0AKQJ//+LZaMP/6CgAAQAdACUAT/ /gvdeYAFABahJQ+Q1AT9EYAQ9XCGVgAAAQEICgA13/NEManYuIt9PqfIBgDiAAAA4gAAAABQ BAvdeQCQJy2WjIbdYAACmgCsM6M//oKAABAB0AKQJ//+LZaMIAEIiBA3AAAAAAutDA8P7ikE AAAAAAKaAAAAo9KvEsPg0bMUqWKv2P4tlowgAQiIEDcAAAAAC60MDw/uAAACmgAAAKNzMBmU giubiPFZD5hPKoqMRlBvDca4g5HN3M+5hgfzP0Qn3g4DJ5n4mRiRqv1A2YrabLlHyBcckGY8 a+AEj1m/m+uXhI2f3IYmTnCj0pl+NM/Q0LSQ6OJHIqq1SMYkFXuiFLfhgq9nXmTBRYayiiOB oBNMApHRi1I= --qMm9M+Fa2AknHoGS-- From kuznet@ms2.inr.ac.ru Sun Mar 23 16:02:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 23 Mar 2003 16:02:40 -0800 (PST) Received: from sex.inr.ac.ru (sex.inr.ac.ru [193.233.7.165]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2O02Qq9024399 for ; Sun, 23 Mar 2003 16:02:28 -0800 Received: (from kuznet@localhost) by sex.inr.ac.ru (8.6.13/ANK) id DAA25577; Mon, 24 Mar 2003 03:02:14 +0300 From: kuznet@ms2.inr.ac.ru Message-Id: <200303240002.DAA25577@sex.inr.ac.ru> Subject: Re: Patch: minor nit in ip_options_compile() To: niv@us.ibm.com (Nivedita Singhvi) Date: Mon, 24 Mar 2003 03:02:13 +0300 (MSK) Cc: davem@redhat.com, netdev@oss.sgi.com In-Reply-To: <200303211956.50921.niv@us.ibm.com> from "Nivedita Singhvi" at Mar 21, 3 07:56:50 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2036 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kuznet@ms2.inr.ac.ru Precedence: bulk X-list: netdev Content-Length: 776 Lines: 30 Hello! > In the following else clause, we check for opt->is_data, which=20 > should always be set for this case, and if not, current code will > lead to a null ptr dereference because skb is always null in=20 > this case.. You misunderstood tha code. 
If opt->is_data is clear skb may be non-NULL. The option is not used at the moment, but it is pretty silly to lose this. > Look reasonable?=20 No: > +=09=09if ((optptr =3D opt->__data) =3D=3D 0) > +=09=09=09goto error;=20 is identically FALSE. [Sent quoted-printable attachment? Get it back quoted. :-)] > Figured its better to fall down to returning EINVAL.. if (skb == NULL) BUG(); Listen, I love to add BUG()s when it is not mathematically clear that something is invariant. Not in this trivial case. Alexey From davem@redhat.com Sun Mar 23 21:32:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 23 Mar 2003 21:32:45 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2O5Weq9000877 for ; Sun, 23 Mar 2003 21:32:41 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA26524; Sun, 23 Mar 2003 21:29:51 -0800 Date: Sun, 23 Mar 2003 21:29:50 -0800 (PST) Message-Id: <20030323.212950.05858732.davem@redhat.com> To: mk@linux-ipv6.org Cc: kuznet@ms2.inr.ac.ru, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: Re: [PATCH] IPv6 Extension headers From: "David S. Miller" In-Reply-To: <87of48h6f8.wl@karaba.org> References: <20030306093219.1a702868.kazunori@miyazawa.org> <20030305.204348.130225511.davem@redhat.com> <87of48h6f8.wl@karaba.org> X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2037 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 537 Lines: 16 From: Mitsuru KANDA / 神田 充 Date: Tue, 18 Mar 2003 10:32:27 -0800 Could you check this patch? (This patch is against 2.5.65.) I applied this patch with some minor changes. First, many functions in net/ipv6/exthdrs.c and net/ipv6/reassembly.c can be marked static now. Second, some local variables (for example, "nhoff" in ip6_input()) can be eliminated entirely because they compute a value in one place and use it in the very next line and nowhere else is it referenced. Thank you. From davem@redhat.com Sun Mar 23 21:40:20 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 23 Mar 2003 21:40:22 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2O5dcq9001241 for ; Sun, 23 Mar 2003 21:40:19 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA26551; Sun, 23 Mar 2003 21:37:18 -0800 Date: Sun, 23 Mar 2003 21:37:18 -0800 (PST) Message-Id: <20030323.213718.83707259.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: Re: [PATCH] IPv6: use "const" qualifier From: "David S. Miller" In-Reply-To: <20030323.013528.19572208.yoshfuji@linux-ipv6.org> References: <20030323.013528.19572208.yoshfuji@linux-ipv6.org> X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2038 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 334 Lines: 10 From: YOSHIFUJI Hideaki / 吉藤英明 Date: Sun, 23 Mar 2003 01:35:28 +0900 (JST) Specify some arguments of IPv6 address manipulation / testing functions "const" qualifier. Patch is against linux-2.5.64 + ChangeSet 1.1188. This should be suitable for linux-2.4.x. Applied, thanks. From davem@redhat.com Sun Mar 23 21:42:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 23 Mar 2003 21:42:43 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2O5gXq9001595 for ; Sun, 23 Mar 2003 21:42:37 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA26563; Sun, 23 Mar 2003 21:40:07 -0800 Date: Sun, 23 Mar 2003 21:40:07 -0800 (PST) Message-Id: <20030323.214007.96586811.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: Re: [PATCH] IPv6: use RFC2553 constant From: "David S. Miller" In-Reply-To: <20030323.013532.24422763.yoshfuji@linux-ipv6.org> References: <20030323.013532.24422763.yoshfuji@linux-ipv6.org> X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2039 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 176 Lines: 6 From: YOSHIFUJI Hideaki / 吉藤英明 Date: Sun, 23 Mar 2003 01:35:32 +0900 (JST) Use RFC2553 constant variable. Applied, thank you. From davem@redhat.com Sun Mar 23 21:44:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 23 Mar 2003 21:44:23 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2O5iKq9001924 for ; Sun, 23 Mar 2003 21:44:20 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id VAA26573; Sun, 23 Mar 2003 21:41:48 -0800 Date: Sun, 23 Mar 2003 21:41:47 -0800 (PST) Message-Id: <20030323.214147.104238225.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru, usagi@linux-ipv6.org Subject: Re: [PATCH] IPv6: use ipv6_addr_any() for testing unspecified address From: "David S. Miller" In-Reply-To: <20030323.013535.60875023.yoshfuji@linux-ipv6.org> References: <20030323.013535.60875023.yoshfuji@linux-ipv6.org> X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2040 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 195 Lines: 6 From: YOSHIFUJI Hideaki / $B5HF#1QL@(B Date: Sun, 23 Mar 2003 01:35:35 +0900 (JST) Use ipv6_addr_any() for testing unspecified address. Applied, thank you. From Eric.Lemoine@Sun.com Sun Mar 23 23:25:36 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 23 Mar 2003 23:25:39 -0800 (PST) Received: from mailhost.ens-lyon.fr (pluvier.ens-lyon.fr [140.77.167.5]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2O7Otq9002939 for ; Sun, 23 Mar 2003 23:25:36 -0800 Received: from udine.cri2000.ens-lyon.fr ([140.77.13.101]) by mailhost.ens-lyon.fr with smtp (Exim 3.35 #1 (Debian)) id 18wR4k-0001tK-00 for ; Fri, 21 Mar 2003 19:17:06 +0100 Received: by udine.cri2000.ens-lyon.fr (sSMTP sendmail emulation); Fri, 21 Mar 2003 19:17:06 +0100 Date: Fri, 21 Mar 2003 19:17:06 +0100 From: Eric Lemoine To: netdev@oss.sgi.com Subject: EmbryonicRsts between two Linux boxes Message-ID: <20030321181706.GD1103@udine> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2041 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Eric.Lemoine@Sun.com Precedence: bulk X-list: netdev Content-Length: 759 Lines: 30 Hi, I have three Linux boxes on the same physical network. One runs a web server (webfs) and the other two run an HTTP traffic generator. 
Under high load, with many simultaneous open connections (~1000) one of the clients gets a TCP RST ("read: Connection reset by peer"). netstat -s on the server box gives: ... TcpExt: 3002 resets received for embryonic SYN_RECV sockets ... Which, if I understand correctly, means that the server TCP stack receives a SYN|ACK or a RST|ACK from one of the clients for a SYN_RECV socket. How can this happen? Thx. PS: there's 100ms network latency between the server and the clients (emulated by NistNet). And I want to saturate the server. That's the reason I need so many simultaneous connections. -- Eric From toml@us.ibm.com Mon Mar 24 14:31:09 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 24 Mar 2003 14:31:19 -0800 (PST) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2OMV1q9006627 for ; Mon, 24 Mar 2003 14:31:09 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e4.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2OMUChF069128; Mon, 24 Mar 2003 17:30:13 -0500 Received: from tomlt2.austin.ibm.com (tomlt2.austin.ibm.com [9.41.94.20]) by northrelay04.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2OMU7ir168792; Mon, 24 Mar 2003 17:30:10 -0500 Subject: [PATCH] IPSec: IPv6 UDP policy checking From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 24 Mar 2003 16:31:29 -0600 Message-Id: <1048545094.1530.25.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2047 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 1046 Lines: 35 After getting racoon to be able to listen on an IPv6 address, it 
wasn't receiving any of the IKE messages being sent to it. The following patch fixes the problem and is consistent with when and how the IPv4 UDP code invokes xfrm_policy_check. Please review to be sure this is acceptable. Thanks, Tom --- linux-2.5.65-orig/net/ipv6/udp.c 2003-03-17 15:44:41.000000000 -0600 +++ linux-2.5.65/net/ipv6/udp.c 2003-03-24 16:28:02.000000000 -0600 @@ -652,9 +652,6 @@ if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto short_packet; - if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) - goto discard; - saddr = &skb->nh.ipv6h->saddr; daddr = &skb->nh.ipv6h->daddr; uh = skb->h.uh; @@ -712,6 +709,9 @@ sk = udp_v6_lookup(saddr, uh->source, daddr, uh->dest, dev->ifindex); if (sk == NULL) { + if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) + goto discard; + if (skb->ip_summed != CHECKSUM_UNNECESSARY && (unsigned short)csum_fold(skb_checksum(skb, 0, skb->len, skb->csum))) goto discard; From davem@redhat.com Mon Mar 24 17:17:57 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 24 Mar 2003 17:18:09 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2P1Hsq9011861 for ; Mon, 24 Mar 2003 17:17:56 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id RAA28953; Mon, 24 Mar 2003 17:15:26 -0800 Date: Mon, 24 Mar 2003 17:15:26 -0800 (PST) Message-Id: <20030324.171526.31619297.davem@redhat.com> To: toml@us.ibm.com Cc: netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org Subject: Re: [PATCH] IPSec: IPv6 UDP policy checking From: "David S. Miller" In-Reply-To: <1048545094.1530.25.camel@tomlt2.tomloffice.austin.ibm.com> References: <1048545094.1530.25.camel@tomlt2.tomloffice.austin.ibm.com> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2048 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1283 Lines: 38 From: Tom Lendacky Date: 24 Mar 2003 16:31:29 -0600 After getting racoon to be able to listen on an IPv6 address, it wasn't receiving any of the IKE messages being sent to it. The following patch fixes the problem and is consistent with when and how the IPv4 UDP code invokes xfrm_policy_check. Please review to be sure this is acceptable. I have applied this patch. Thanks for finding this bug Tom. Yoshfuji, please note of this patch below from Tom which I have added to my tree. --- linux-2.5.65-orig/net/ipv6/udp.c 2003-03-17 15:44:41.000000000 -0600 +++ linux-2.5.65/net/ipv6/udp.c 2003-03-24 16:28:02.000000000 -0600 @@ -652,9 +652,6 @@ if (!pskb_may_pull(skb, sizeof(struct udphdr))) goto short_packet; - if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) - goto discard; - saddr = &skb->nh.ipv6h->saddr; daddr = &skb->nh.ipv6h->daddr; uh = skb->h.uh; @@ -712,6 +709,9 @@ sk = udp_v6_lookup(saddr, uh->source, daddr, uh->dest, dev->ifindex); if (sk == NULL) { + if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) + goto discard; + if (skb->ip_summed != CHECKSUM_UNNECESSARY && (unsigned short)csum_fold(skb_checksum(skb, 0, skb->len, skb->csum))) goto discard; From ahu@outpost.ds9a.nl Tue Mar 25 04:01:34 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 25 Mar 2003 04:01:37 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2PC1Mq9024831 for ; Tue, 25 Mar 2003 04:01:24 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id AF61B44EE; Tue, 
25 Mar 2003 12:27:05 +0100 (CET) Date: Tue, 25 Mar 2003 12:27:05 +0100 From: bert hubert To: netdev@oss.sgi.com Subject: ip6sec broken *differently* in 2.5.66 :-) Message-ID: <20030325112705.GA9793@outpost.ds9a.nl> Mail-Followup-To: bert hubert , netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2049 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 1115 Lines: 32 Alexey, thanks for the work on ip6sec but it still does slightly the wrong thing here. snapcount:~# ping6 -s 1406 deef.ds9a.nl 43.14 snapcount.ipv6.ds9a.nl > deef.ds9a.nl: AH(spi=0x00003d54,sumlen=16,seq=0x1e7): ESP(spi=0x00003d55,seq=0x1e7) [hlim 0] (len 1456) 43.14 deef.ds9a.nl > snapcount.ipv6.ds9a.nl: icmp6: echo reply (len 1414, hlim 64) If I make the ping one byte bigger, nothing gets sent out! This goes on until ping6 -s 1452, and then suddenly with ping6 -s 1453: 43.57 snapcount.ipv6.ds9a.nl > deef.ds9a.nl: AH(spi=0x00003d54,sumlen=16,seq=0x327): ESP(spi=0x00003d55,seq=0x327) [hlim 0] (len 64) Like the first segment didn't make it. If I further increase the pingsize to -s 1455, this happens: 34.02 snapcount.ipv6.ds9a.nl > deef.ds9a.nl: AH(spi=0x00003d54,sumlen=16,seq=0x409): ESP(spi=0x00003d55,seq=0x409) [hlim 0] (len 72) Anything I can do to help, let me know. Thanks. 
-- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From mator@gsib.sl.ru Tue Mar 25 07:36:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 25 Mar 2003 07:36:41 -0800 (PST) Received: from gsib.sl.ru (IDENT:SYSTEM@gsib.sl.ru [217.171.66.34]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2PFaRq9031415 for ; Tue, 25 Mar 2003 07:36:29 -0800 Received: from gsib.sl.ru (IDENT:mator@localhost [127.0.0.1]) by gsib.sl.ru (8.12.8/8.12.8/1.0) with ESMTP id h2PFcZ9N014466 for ; Tue, 25 Mar 2003 18:38:35 +0300 Received: (from mator@localhost) by gsib.sl.ru (8.12.8/8.12.5/Submit) id h2PFcYjg014464 for netdev@oss.sgi.com; Tue, 25 Mar 2003 18:38:34 +0300 Date: Tue, 25 Mar 2003 18:38:34 +0300 From: Anatoly Pugachev To: netdev@oss.sgi.com Subject: minor error in linux/net/core/dev.c Message-ID: <20030325153834.GJ16269@gsib.sl.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-Accept-Language: en,ru X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2050 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mator@gsib.ru Precedence: bulk X-list: netdev Content-Length: 710 Lines: 20 Hello. I can't fix this myself, so emailing to you: [root@p4 mator]# modprobe e100 /var/log/messages: Mar 25 17:26:46 p4 kernel: Intel(R) PRO/100 Network Driver - version 2.2.21-k1 Mar 25 17:26:46 p4 kernel: Copyright (c) 2003 Intel Corporation Mar 25 17:26:46 p4 kernel: Mar 25 17:26:46 p4 kernel: e100: selftest OK. 
Mar 25 17:26:49 p4 kernel: Freeing alive device f6623800, eth%%d Mar 25 17:26:50 p4 kernel: e100: eth1: Intel(R) PRO/100 Network Connection Mar 25 17:26:50 p4 kernel: Hardware receive checksums enabled Mar 25 17:26:50 p4 kernel: cpu cycle saver enabled notice eth%%d output, i have found "Freeing alive device" in linux/net/core/dev.c netdev_finish_unregister() thanks, bye. -- /mator From jgrimm2@us.ibm.com Tue Mar 25 09:10:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 25 Mar 2003 09:10:49 -0800 (PST) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.102]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2PH9uq9000612 for ; Tue, 25 Mar 2003 09:10:43 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e2.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2PH9oE0074794; Tue, 25 Mar 2003 12:09:50 -0500 Received: from us.ibm.com (touki.austin.ibm.com [9.41.94.47]) by northrelay04.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2PH9lEg036582; Tue, 25 Mar 2003 12:09:48 -0500 Message-ID: <3E80894A.59C6EB95@us.ibm.com> Date: Tue, 25 Mar 2003 10:52:26 -0600 From: Jon Grimm X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.5.65 i686) X-Accept-Language: en MIME-Version: 1.0 To: "linux-net@vger.kernel.org" , "netdev@oss.sgi.com" CC: "David S. Miller" Subject: [PATCH] Export a few icmpv6 symbols Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2051 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgrimm2@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 500 Lines: 17 Export a few icmpv6 symbols so SCTP can have at them from its v6 err_handler. 
Thanks, Jon Grimm --- linux-2.5.66/net/ipv6/ipv6_syms.c Tue Mar 25 08:13:07 2003 +++ lksctp-2.5.work/net/ipv6/ipv6_syms.c Tue Mar 25 10:27:43 2003 @@ -9,6 +9,8 @@ EXPORT_SYMBOL(ipv6_addr_type); EXPORT_SYMBOL(icmpv6_send); +EXPORT_SYMBOL(icmpv6_statistics); +EXPORT_SYMBOL(icmpv6_err_convert); EXPORT_SYMBOL(ndisc_mc_map); EXPORT_SYMBOL(register_inet6addr_notifier); EXPORT_SYMBOL(unregister_inet6addr_notifier); From davem@redhat.com Tue Mar 25 22:33:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Tue, 25 Mar 2003 22:33:44 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2Q6Xeq9015568 for ; Tue, 25 Mar 2003 22:33:41 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id WAA31792; Tue, 25 Mar 2003 22:30:54 -0800 Date: Tue, 25 Mar 2003 22:30:53 -0800 (PST) Message-Id: <20030325.223053.15735470.davem@redhat.com> To: jgrimm2@us.ibm.com Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] Export a few icmpv6 symbols From: "David S. Miller" In-Reply-To: <3E80894A.59C6EB95@us.ibm.com> References: <3E80894A.59C6EB95@us.ibm.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2053 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 192 Lines: 7 From: Jon Grimm Date: Tue, 25 Mar 2003 10:52:26 -0600 Export a few icmpv6 symbols so SCTP can have at them from its v6 err_handler. Applied, thanks Jon. 
From mcmanus@datapower.ducksong.com Wed Mar 26 14:31:28 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 26 Mar 2003 14:31:31 -0800 (PST) Received: from datapower.ducksong.com (ip67-93-141-189.z141-93-67.customer.algx.net [67.93.141.189]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2QMVQq9021159 for ; Wed, 26 Mar 2003 14:31:27 -0800 Received: (from mcmanus@localhost) by datapower.ducksong.com (8.11.6/8.11.6) id h2QMVg602235 for netdev@oss.sgi.com; Wed, 26 Mar 2003 17:31:42 -0500 Date: Wed, 26 Mar 2003 17:31:42 -0500 From: "Patrick R. McManus" To: netdev@oss.sgi.com Subject: Intel 1000 MT slow to restart Message-ID: <20030326223142.GA2032@ducksong.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2054 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mcmanus@ducksong.com Precedence: bulk X-list: netdev Content-Length: 895 Lines: 23 I'm running Intel's e1000 driver (version 4.4.12, and I also tried 4.6.11) for a 1000/MT (dual) NIC on a 2.4.19 kernel. For normal operations, everything is fine. However, if I do "ifconfig eth0 down; ifconfig eth0 up", 20 or 30 seconds pass during which I can't move any traffic. The ifconfig up comes back quickly, and ifconfig reports the driver as up (and netlink thinks it's up if I try and add another one). I just can't move any traffic: no rx's, and my tx's only move counters in the driver; they don't actually make it to the wire. If I do "ifconfig eth0 down; sleep 30; ifconfig eth0 up", all is well immediately after the up. The driver doesn't seem to be doing anything different in either case; it's as if the NIC itself is enforcing some kind of quiet period. I'm really writing to see if I'm crazy, and hoping someone else can corroborate that this is what they've seen too. Help?
-Patrick From rddunlap@osdl.org Wed Mar 26 14:31:38 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 26 Mar 2003 14:31:47 -0800 (PST) Received: from mail.osdl.org (air-2.osdl.org [65.172.181.6]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2QMVaq9021172 for ; Wed, 26 Mar 2003 14:31:37 -0800 Received: from dragon.pdx.osdl.net (dragon.pdx.osdl.net [172.20.1.27]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id h2QMVWW00626; Wed, 26 Mar 2003 14:31:32 -0800 Date: Wed, 26 Mar 2003 14:27:24 -0800 From: "Randy.Dunlap" To: linux-net@vger.kernel.org Cc: netdev@oss.sgi.com Subject: tcp_mib questions Message-Id: <20030326142724.7e8cb865.rddunlap@osdl.org> Organization: OSDL X-Mailer: Sylpheed version 0.8.11 (GTK+ 1.2.10; i586-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2055 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rddunlap@osdl.org Precedence: bulk X-list: netdev Content-Length: 605 Lines: 27 Hi, Looking at the Linux tcp_mib, these fields are not updated (set) AFAICT: 1. unsigned long TcpRtoAlgorithm; 2. unsigned long TcpRtoMin; 3. unsigned long TcpRtoMax; What retransmit algorithm does Linux use? or are there several? I see comments/references to Van Jacobson's algorithm. Is that the one used, or a variant of it? 4. unsigned long TcpMaxConn; If TcpMaxConn is dynamic, it can be set to -1. 5. unsigned long TcpEstabResets; Just not updated anywhere. Is there a (historical) reason for this, or it's just omitted, or not wanted? Any comments or corrections? 
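For readers following the retransmit-timer question above: the Van Jacobson style estimator is commonly written down in the form given in RFC 6298. The sketch below shows only that textbook form. It is not taken from the kernel source and makes no claim about what Linux actually implements or what TcpRtoAlgorithm should report. The names rto_state and rto_update are hypothetical, times are in milliseconds, and the minimum and maximum clamps on the result are left out.

```c
#include <assert.h>

/* Hypothetical helper type: smoothed RTT state as described in RFC 6298. */
struct rto_state {
    double srtt;        /* smoothed round-trip time, ms */
    double rttvar;      /* round-trip time variation, ms */
    int    have_sample; /* 0 until the first measurement arrives */
};

/*
 * Feed one RTT measurement r (ms) into the estimator and return the
 * resulting retransmission timeout (ms). g is the clock granularity.
 * Gains are the RFC 6298 defaults: alpha = 1/8, beta = 1/4, K = 4.
 */
static double rto_update(struct rto_state *s, double r, double g)
{
    double k;

    if (!s->have_sample) {
        /* First measurement: SRTT = R, RTTVAR = R / 2. */
        s->srtt = r;
        s->rttvar = r / 2.0;
        s->have_sample = 1;
    } else {
        /* RTTVAR is updated first, using the old SRTT. */
        double err = s->srtt - r;
        if (err < 0)
            err = -err;
        s->rttvar = 0.75 * s->rttvar + 0.25 * err;
        s->srtt = 0.875 * s->srtt + 0.125 * r;
    }

    /* RTO = SRTT + max(G, K * RTTVAR). */
    k = 4.0 * s->rttvar;
    return s->srtt + (k > g ? k : g);
}
```

A caller would keep one rto_state per connection, zero-initialised, and call rto_update() once per RTT sample; a real stack would additionally clamp the result between its minimum and maximum RTO.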
Thanks, -- ~Randy From seong@etri.re.kr Wed Mar 26 17:42:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 26 Mar 2003 17:42:23 -0800 (PST) Received: from cms1.etri.re.kr (cms1.etri.re.kr [129.254.16.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2R1faq9000460 for ; Wed, 26 Mar 2003 17:42:17 -0800 Received: from SEONG ([129.254.172.40]) by cms1.etri.re.kr with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id HP0CV1FZ; Thu, 27 Mar 2003 10:41:04 +0900 Message-ID: <003401c2f402$3fbc4450$28acfe81@seong> From: "Seong Moon" To: Subject: IP, MAC address duplication detection ? Date: Thu, 27 Mar 2003 10:43:21 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4920.2300 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4920.2300 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2056 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: seong@etri.re.kr Precedence: bulk X-list: netdev Content-Length: 486 Lines: 15 Hi, there. On a Linux box, how can I detect IP/MAC address duplication? I'm using kernel-2.4.18, but the kernel does not seem to have a gratuitous ARP implementation. Is that right? I know I can detect IP address duplication with the arping program, but I want to implement the following mechanism: when the Linux machine boots, or one of the network interfaces is assigned a MAC/IP address, the Linux box detects whether the newly assigned MAC/IP address is a duplicate. How can I do this? Thanks.
From cfriesen@nortelnetworks.com Wed Mar 26 23:01:41 2003 Received: with ECARTIS (v1.0.0; list netdev); Wed, 26 Mar 2003 23:01:48 -0800 (PST) Received: from zcars04f.nortelnetworks.com (zcars04f.nortelnetworks.com [47.129.242.57]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2R71dq9022640 for ; Wed, 26 Mar 2003 23:01:41 -0800 Received: from zcard309.ca.nortel.com (zcard309.ca.nortel.com [47.129.242.69]) by zcars04f.nortelnetworks.com (Switch-2.2.5/Switch-2.2.0) with ESMTP id h2R71WM11499; Thu, 27 Mar 2003 02:01:32 -0500 (EST) Received: from zcard0k6.ca.nortel.com ([47.129.242.158]) by zcard309.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GDF48QDC; Thu, 27 Mar 2003 02:01:32 -0500 Received: from pcard0ks.ca.nortel.com ([47.129.117.131]) by zcard0k6.ca.nortel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id GRL0HDRA; Thu, 27 Mar 2003 02:01:33 -0500 Received: from nortelnetworks.com (localhost.localdomain [127.0.0.1]) by pcard0ks.ca.nortel.com (Postfix) with ESMTP id A2C332E136; Thu, 27 Mar 2003 02:01:31 -0500 (EST) Message-ID: <3E82A1CB.7090408@nortelnetworks.com> Date: Thu, 27 Mar 2003 02:01:31 -0500 X-Sybari-Space: 00000000 00000000 00000000 From: Chris Friesen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.8) Gecko/20020204 X-Accept-Language: en-us MIME-Version: 1.0 To: Linux Kernel Mailing List , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: [RFC|PATCH] AF_UNIX multicast capability for 2.5.66 Content-Type: multipart/mixed; boundary="------------020204070109060409000206" X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2058 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cfriesen@nortelnetworks.com Precedence: bulk X-list: netdev Content-Length: 18319 Lines: 703 This is a multi-part message in MIME format. 
--------------020204070109060409000206 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit

For those who didn't read the first couple of threads, this patch adds multicast functionality to unix sockets, in a similar fashion to UDP multicast (and using a similar API), but easier to use.

To use the functionality, in userspace you would allocate and bind a socket as normal in the AF_UNIX family, and then you would use setsockopt() to associate yourself with one or more multicast addresses, also in the AF_UNIX family. Any message sent to a multicast address gets duplicated by the kernel and distributed to all processes associated with that address. If an address already exists and is not multicast, you cannot associate yourself with it using setsockopt(), and if it exists and is multicast, you cannot bind() to it. All AF_UNIX addresses exist in the same namespace and must be either multicast or unicast.

It does not make sense to allow streaming to a multicast address, so I plan on disallowing this (and other similar things) in a future release of the patch. I'm not sure about allowing datagram sockets to connect() to a multicast address; I haven't looked at the code in depth.

At any rate, here are the results of some testing on the latest kernel comparing kernel multicast against a userspace solution. There is a sender program which sends messages of various sizes to various numbers of listeners. Each message has a timestamp embedded within it, and the listeners determine the latency between the sending and receiving of the message. In the userspace solution, the list of listeners is kept in a shared memory area, which is faster than using a distributor process.

In the interests of figuring out the best possible performance I've changed the testing methodology from the last bunch of tests, and the organization of the results is slightly different to make it easier to add new results.
In this test sequence the sender and receiver processes have been run with nice -20 to minimize interference from other entities in the system. These are best-case results from a number of runs, but the numbers are fairly consistent across runs.

data size and kernel     number of listeners
                         10         20         50          100         200
44 bytes
  2.5.66 userspace       134,297    206,561    416,1401    761,2824    1500,5711
  2.5.66 kernelspace     75,232     119,457    213,1142    356,2308    679,4710
236 bytes
  2.5.66 userspace       143,306    218,584    447,1472    814,3013    1399,6007
  2.5.66 kernelspace     80,244     115,469    216,1176    365,2371    682,4893
40036 bytes
  2.5.66 userspace       478,3613   497,7405   496,18114   487,36759   518,74566
  2.5.66 kernelspace     287,1672   327,3299   444,8129    663,16125   1000,31937

The numbers definitely favour a kernel-space solution. That said, it would be possible to implement this using UDP messaging, but UDP latency is generally about 30 percent higher than AF_UNIX on my 1.8GHz P4, and it's more of a hassle to configure UDP multicast.

I would appreciate any comments on the patch. If you see any technical bugs, or if you think there is a better way to do something, please do let me know. I'm sure there's some fine point about locking that I missed, or something similar.
Thanks, Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com --------------020204070109060409000206 Content-Type: text/plain; name="unixmcast-2.5.66.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="unixmcast-2.5.66.patch" diff -Nru a/include/linux/un.h b/include/linux/un.h --- a/include/linux/un.h Thu Mar 27 01:53:11 2003 +++ b/include/linux/un.h Thu Mar 27 01:53:11 2003 @@ -8,4 +8,12 @@ char sun_path[UNIX_PATH_MAX]; /* pathname */ }; +#ifdef CONFIG_UNIX_MULTICAST + +#define UNIX_ADD_MEMBERSHIP 35 +#define UNIX_DROP_MEMBERSHIP 36 + +#endif + + #endif /* _LINUX_UN_H */ diff -Nru a/include/net/af_unix.h b/include/net/af_unix.h --- a/include/net/af_unix.h Thu Mar 27 01:53:11 2003 +++ b/include/net/af_unix.h Thu Mar 27 01:53:11 2003 @@ -61,6 +61,20 @@ #define unix_state_wunlock(s) write_unlock(&unix_sk(s)->lock) #ifdef __KERNEL__ + +#ifdef CONFIG_UNIX_MULTICAST +struct unix_mcast +{ + unix_socket *listener; + unix_socket *addr; + struct unix_mcast *nextlistener; + struct unix_mcast *prevlistener; + struct unix_mcast *nextaddr; + struct unix_mcast *prevaddr; +}; + +#endif + /* The AF_UNIX socket */ struct unix_sock { /* WARNING: sk has to be the first member */ @@ -75,6 +89,10 @@ atomic_t inflight; rwlock_t lock; wait_queue_head_t peer_wait; +#ifdef CONFIG_UNIX_MULTICAST + int is_mcast_addr; + struct unix_mcast *mcastnode; +#endif }; #define unix_sk(__sk) ((struct unix_sock *)__sk) #endif diff -Nru a/net/Kconfig b/net/Kconfig --- a/net/Kconfig Thu Mar 27 01:53:11 2003 +++ b/net/Kconfig Thu Mar 27 01:53:11 2003 @@ -157,6 +157,19 @@ Say Y unless you know what you are doing. +config UNIX_MULTICAST + bool "Unix domain multicasting (EXPERIMENTAL)" + depends on UNIX && EXPERIMENTAL + ---help--- + If you say Y here you will include support for multicast unix domain + sockets. 
Multiple sockets can add themselves to a multicast address + group, and any packet sent to the multicast address will be distributed + to all unix sockets that have associated themselves with the multicast + address. + + This code is experimental. Say N unless you want to try efficient + one-sender many-listeners messaging. + config NET_KEY tristate "PF_KEY sockets" ---help--- diff -Nru a/net/unix/af_unix.c b/net/unix/af_unix.c --- a/net/unix/af_unix.c Thu Mar 27 01:53:11 2003 +++ b/net/unix/af_unix.c Thu Mar 27 01:53:11 2003 @@ -172,6 +172,28 @@ kfree(addr); } +#ifdef CONFIG_UNIX_MULTICAST +//call with write locks held on both sockets that have links to the node +static void unlink_mcast_node(struct unix_mcast *node) +{ + if (node->prevlistener==NULL) + unix_sk(node->addr)->mcastnode = node->nextlistener; + else + node->prevlistener->nextlistener = node->nextlistener; + + if (node->nextlistener!=NULL) + node->nextlistener->prevlistener = node->prevlistener; + + if (node->prevaddr==NULL) + unix_sk(node->listener)->mcastnode = node->nextaddr; + else + node->prevaddr->nextaddr = node->nextaddr; + if (node->nextaddr!=NULL) + node->nextaddr->prevaddr = node->prevaddr; +} +#endif + + /* * Check unix socket name: * - should be not zero length. 
@@ -342,7 +364,7 @@ static void unix_sock_destructor(struct sock *sk) { struct unix_sock *u = unix_sk(sk); - + skb_queue_purge(&sk->receive_queue); BUG_TRAP(atomic_read(&sk->wmem_alloc) == 0); @@ -363,6 +385,60 @@ MOD_DEC_USE_COUNT; } +#ifdef CONFIG_UNIX_MULTICAST +//must hold wlock on sk before calling +static void unix_release_mcast(unix_socket *sk) +{ + struct unix_sock *u = unix_sk(sk); + struct unix_mcast *node; + struct unix_mcast *oldnode; + unix_socket *other; + struct unix_sock *o; + struct socket *releasesock; + + if (!u->mcastnode) + return; + + //otherwise we want to walk the chain and unlink from any multicast + //addresses with which we are registered + node = u->mcastnode; + + while(node!=NULL){ + other=node->addr; + o = unix_sk(other); + unix_state_wlock(other); + + unlink_mcast_node(node); + + sock_put(sk); + sock_put(other); + + //if the socket has no more listeners, clean it up + if (!o->mcastnode) + releasesock=o->sk.socket; + else + releasesock=NULL; + + unix_state_wunlock(other); + + oldnode=node; + node=node->nextaddr; + + //printk("freeing multicast node at %p\n",oldnode); + kfree(oldnode); + + if (releasesock) { + //printk("releasing multicast socket at %p\n",releasesock); + sock_release(releasesock); + } + } + + return; +} +#endif + + + static int unix_release_sock (unix_socket *sk, int embrion) { struct unix_sock *u = unix_sk(sk); @@ -376,6 +452,10 @@ /* Clear state */ unix_state_wlock(sk); + +#ifdef CONFIG_UNIX_MULTICAST + unix_release_mcast(sk); +#endif sock_orphan(sk); sk->shutdown = SHUTDOWN_MASK; dentry = u->dentry; @@ -509,6 +589,10 @@ init_MUTEX(&u->readsem); /* single task reading lock */ init_waitqueue_head(&u->peer_wait); unix_insert_socket(&unix_sockets_unbound, sk); +#ifdef CONFIG_UNIX_MULTICAST + u->is_mcast_addr = 0; + u->mcastnode = NULL; +#endif return sk; } @@ -1204,6 +1288,13 @@ unsigned hash; struct sk_buff *skb; long timeo; +#ifdef CONFIG_UNIX_MULTICAST + struct unix_sock *o; + struct unix_mcast *node=NULL; + 
unix_socket *addr=NULL; + int sentmsgs=0; + struct sk_buff *dupskb=NULL; +#endif struct scm_cookie tmp_scm; if (NULL == siocb->scm) @@ -1262,10 +1353,11 @@ goto out_free; } +mcastrestart: unix_state_rlock(other); err = -EPERM; if (!unix_may_send(sk, other)) - goto out_unlock; + goto mcast_out_unlock; if (test_bit(SOCK_DEAD, &other->flags)) { /* @@ -1290,48 +1382,143 @@ other = NULL; if (err) - goto out_free; + goto mcast_out_unlock; +#ifdef CONFIG_UNIX_MULTICAST + if (addr!=NULL) + goto mcast_out_unlock; +#endif goto restart; } err = -EPIPE; if (other->shutdown&RCV_SHUTDOWN) - goto out_unlock; + goto mcast_out_unlock; err = security_unix_may_send(sk->socket, other->socket); if (err) - goto out_unlock; + goto mcast_out_unlock; if (unix_peer(other) != sk && skb_queue_len(&other->receive_queue) > other->max_ack_backlog) { if (!timeo) { err = -EAGAIN; - goto out_unlock; + printk("unable to send to socket\n"); + goto mcast_out_unlock; } timeo = unix_wait_for_peer(other, timeo); + other=NULL; err = sock_intr_errno(timeo); if (signal_pending(current)) goto out_free; +#ifdef CONFIG_UNIX_MULTICAST + if (addr!=NULL) + goto mcast_out_unlock; +#endif goto restart; } + +#ifdef CONFIG_UNIX_MULTICAST + //works but could be better. for multicast we hit two conditionals for each time through + o=unix_sk(other); + if (o->mcastnode) { + if ((addr==NULL) && (o->is_mcast_addr)) { + //printk("setting up initial real dest\n"); + addr=other; + node=o->mcastnode; + if (node!=NULL) { + other=node->listener; + //printk("going back to mcastrestart\n"); + goto mcastrestart; + } else { + err=-ECONNREFUSED; + goto out_unlock; + } + } + if (node->nextlistener != NULL) { + //printk("duping skb\n"); + dupskb=skb_clone(skb,GFP_ATOMIC); + + //note: does atomic_add(skb->truesize, &sk->wmem_alloc); + //do we want to charge the sender for the skb? 
+ skb_set_owner_w(dupskb, sk); + + } + } +#endif + + //if (addr!=NULL) + //printk("queueing skb\n"); skb_queue_tail(&other->receive_queue, skb); unix_state_runlock(other); other->data_ready(other, len); + +#ifdef CONFIG_UNIX_MULTICAST + if (addr!=NULL) { + sentmsgs++; + //printk("incrementing sentmsgs\n"); + + if (dupskb!=NULL) { + node=node->nextlistener; + other=node->listener; + skb=dupskb; + dupskb=NULL; + //printk("setting skb to dup, going to next listener, back to mcastrestart\n"); + goto mcastrestart; + } + other=addr; + unix_state_runlock(other); + //printk("unlocking real address, putting other, and returning len\n"); + } +#endif sock_put(other); scm_destroy(siocb->scm); return len; +mcast_out_unlock: +#ifdef CONFIG_UNIX_MULTICAST + //something bad happened, were unable to send to a final destination + if (addr!=NULL) { + //printk("handling error\n"); + if (other) { + //printk("unlocking real address\n"); + unix_state_runlock(other); + } + //we are sending to a multicast address + if (node->nextlistener != NULL) { + //if there are any more listeners, keep going. + node=node->nextlistener; + other=node->listener; + //printk("going to next listener, back to mcastrestart\n"); + goto mcastrestart; + } else { + //oops, no more listeners. 
If any listeners got it treat is + //as successful + //printk("setting other to addr\n"); + other=addr; + if (sentmsgs) { + //printk("setting err to len\n"); + err=len; + } + } + } +#endif + out_unlock: - unix_state_runlock(other); + if (other) { + //printk("unlocking fake address\n"); + unix_state_runlock(other); + } out_free: kfree_skb(skb); out: - if (other) + if (other) { + //printk("putting fake address and returning err\n"); sock_put(other); + } scm_destroy(siocb->scm); return err; } @@ -1883,6 +2070,230 @@ } #endif +#ifdef CONFIG_UNIX_MULTICAST +static int unix_mc_attach(unix_socket *sk , int optlen, struct sockaddr_un *mreq) +{ + int err=0; + struct unix_sock *u = unix_sk(sk); + struct unix_mcast *node; + unix_socket *other; + struct socket *newsocket; + struct sockaddr_un *sunaddr; + int namelen; + unsigned hash; + + //now see if the address we're trying to join already has a socket + sunaddr=mreq; + err = unix_mkname(sunaddr, optlen, &hash); + if (err < 0) + return err; + + namelen = err; + + other = unix_find_other(sunaddr, namelen, SOCK_DGRAM, hash, &err); + if (other==NULL) { + //allocate a socket for the listening address + err=sock_create(AF_UNIX, SOCK_DGRAM, 0, &newsocket); + if (err<0) + return err; + + //printk("created multicast socket at %p\n",newsocket); + + //okay, have to set up a new multicast destination socket + err = newsocket->ops->bind(newsocket,(struct sockaddr*) sunaddr, optlen); + if (err<0) + goto release_out; + + other=newsocket->sk; + unix_state_wlock(other); + unix_sk(other)->mcastnode=NULL; + unix_sk(other)->is_mcast_addr=1; + unix_state_wunlock(other); + } else { + //if the address exists but isn't a multicast address, we can't attach to it + if (!unix_sk(other)->is_mcast_addr) + return -EADDRINUSE; + } + + //try and allocate a multicast node + node=(struct unix_mcast *)kmalloc(sizeof(struct unix_mcast), GFP_KERNEL); + if (!node) { + err = -ENOMEM; + goto release_out; + } + + //printk("allocated multicast node at %p\n",node); + 
+ //now set up the multicast node + //this node sits on two linked lists, one for the multicast address + //containing nodes pointing to all the sockets associated with the address, + //and one for each userspace socket containing nodes pointing to all the + //multicast addresses that the userspace socket is listening to + + //take holds on both sockets for the node references + sock_hold(sk); + sock_hold(other); + + node->listener = sk; + node->addr = other; + + unix_state_wlock(sk); + unix_state_wlock(other); + + //insert node at head of list from other + node->nextlistener = unix_sk(other)->mcastnode; + node->prevlistener = NULL; + unix_sk(other)->mcastnode = node; + if (node->nextlistener!=NULL) + node->nextlistener->prevlistener = node; + + //insert node at head of list from sk + node->nextaddr = u->mcastnode; + node->prevaddr = NULL; + u->mcastnode = node; + if (node->nextaddr!=NULL) + node->nextaddr->prevaddr = node; + + unix_state_wunlock(other); + unix_state_wunlock(sk); + + return 0; + +release_out: + //printk("releasing socket at %p\n",newsocket); + sock_release(newsocket); + + return err; +} + +static int unix_mc_detach(unix_socket *sk , int optlen, struct sockaddr_un *mreq) +{ + int err=0; + struct unix_mcast *node; + struct socket *releasesock=NULL; + unix_socket *other; + struct unix_sock *o; + struct sockaddr_un *sunaddr; + int namelen; + unsigned hash; + + //try and find the socket belonging to the address + sunaddr=mreq; + err = unix_mkname(sunaddr, optlen, &hash); + if (err < 0) + goto out; + namelen = err; + + other = unix_find_other(sunaddr, namelen, SOCK_DGRAM, hash, &err); + o=unix_sk(other); + + if (other==NULL) { + //strange, trying to leave a group that doesn't exist. 
+ //should probably log it + return 0; + } else { + //if the address exists but isn't a multicast address, we can't detach from it + if (!o->is_mcast_addr) { + err=-ENOENT; + goto out; + } + } + + + unix_state_wlock(other); + unix_state_wlock(sk); + + node = o->mcastnode; + + while(node) + { + if (node->listener == sk) + break; + node=node->nextlistener; + } + + + if (node->listener != sk) { + //not actually a group member + err=-EINVAL; + goto out; + } + + unlink_mcast_node(node); + + if (o->mcastnode==NULL) + //can I call sock_release here with the locks held since I've got + //a refcount on other here? + releasesock=o->sk.socket; + + unix_state_wunlock(sk); + unix_state_wunlock(other); + + //give up refcounts since we're getting rid of the node + sock_put(sk); + sock_put(other); + + kfree(node); + + if (releasesock) + sock_release(releasesock); + +out: + return err; +} +#endif + + +static int unix_setsockopt(struct socket *sock, int level, int optname, + char *optval, int optlen) +{ +#ifndef CONFIG_UNIX_MULTICAST + return -EOPNOTSUPP; +#else + int err=0; + struct sock *sk=sock->sk; + lock_sock(sk); + + if (sk->type != SOCK_DGRAM) + goto e_inval; + + switch (optname) { + case UNIX_ADD_MEMBERSHIP: + case UNIX_DROP_MEMBERSHIP: + { + struct sockaddr_un mreq; + + if (optlen > sizeof(struct sockaddr_un)) + goto e_inval; + err = -EFAULT; + + memset(&mreq, 0, sizeof(mreq)); + if (copy_from_user(&mreq,optval,optlen)) + break; + + if (optname == UNIX_ADD_MEMBERSHIP) + err = unix_mc_attach(sk,optlen,&mreq); + else + err = unix_mc_detach(sk,optlen,&mreq); + break; + } + default: + err = -ENOPROTOOPT; + break; + } + release_sock(sk); + return err; + +e_inval: + release_sock(sk); + return -EINVAL; + +#endif +} + + + + struct proto_ops unix_stream_ops = { .family = PF_UNIX, @@ -1917,7 +2328,7 @@ .ioctl = unix_ioctl, .listen = sock_no_listen, .shutdown = unix_shutdown, - .setsockopt = sock_no_setsockopt, + .setsockopt = unix_setsockopt, .getsockopt = sock_no_getsockopt, 
.sendmsg = unix_dgram_sendmsg, .recvmsg = unix_dgram_recvmsg, --------------020204070109060409000206-- From jsd@monmouth.com Thu Mar 27 03:14:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 03:14:57 -0800 (PST) Received: from av8n.net (pcp03191463pcs.midltn01.nj.comcast.net [68.37.175.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RBE4q9016301 for ; Thu, 27 Mar 2003 03:14:48 -0800 Received: (qmail 30434 invoked from network); 27 Mar 2003 11:13:59 -0000 Received: from localhost (HELO monmouth.com) (127.0.0.1) by localhost with SMTP; 27 Mar 2003 11:13:59 -0000 Message-ID: <3E82DCF7.7090706@monmouth.com> Date: Thu, 27 Mar 2003 06:13:59 -0500 From: "John S. Denker" User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030323 X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev Subject: ?completeness of IPsec feature-set Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2059 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jsd@monmouth.com Precedence: bulk X-list: netdev Content-Length: 1193 Lines: 35 Hi -- I've been unable to find much discussion of what IPsec features should be built into 2.5 / 2.6 to ensure a reasonable level of usability and scalability. For example, consider the challenge of establishing an ordinary VPN where N-1 of the gateways have changeable wild-side IP addresses. AFAICT nobody knows how to get racoon to do this. People were hoping that the new IPsec implementation would be a step forward. If it can't support road warriors it might be considered a step backwards. Mr. Atkins recently offered to look into the road-warrior issue in particular ... http://lists.freeswan.org/pipermail/design/2003-March/004575.html ... 
but the overall question remains: What has been done to ensure completeness and coherence of the design in general? Is there a document somewhere listing the set of desirable features and the status thereof? If not, it's high time to create one. If you want to know what sort of features I'm talking about, please see http://www.monmouth.com/~jsd/vpn/ipsec+routing/feature-list.htm Some of the listed features are obvious and already implemented or at least promised. But others may be less obvious and their status is not clear. From shmulik.hen@intel.com Thu Mar 27 05:32:49 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 05:32:54 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RDW9q9023395 for ; Thu, 27 Mar 2003 05:32:49 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by hermes.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2RDSZ800681 for ; Thu, 27 Mar 2003 13:28:35 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by petasus.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2RDQLL04533 for ; Thu, 27 Mar 2003 13:26:21 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032705321725945 ; Thu, 27 Mar 2003 05:32:19 -0800 Date: Thu, 27 Mar 2003 15:32:02 +0200 (IST) From: shmulik.hen@intel.com X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com Reply-To: shmulik.hen@intel.com To: Dan Eble , bond-devel , bond-announce , linux-netdev , linux-kernel , linux-net Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. 
In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by oss.sgi.com id h2RDW9q9023395 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2060 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev Content-Length: 4332 Lines: 103

Furthermore, holding a lock_irq doesn't mean bottom halves are disabled
too; it just means interrupts are disabled and no *new* softirq can be
queued. Consider the following situation:

In bond_release() we hold write_lock_irqsave(&bond->lock, flags) and
then do all the releasing stuff. If, for example, we need to call
dev_mc_upload() for the released slave, the following will happen:

	spin_lock_bh(&dev->xmit_lock);
	__dev_mc_upload(dev);
	spin_unlock_bh(&dev->xmit_lock);

spin_unlock_bh() calls local_bh_enable(), which checks local_bh_count.
If local_bh_count reaches zero (and it does), it directly executes
do_softirq(). The check for in_interrupt() in do_softirq() is false, so
the softirqs that were queued begin to run and process the Tx and Rx
backlogs. dev_queue_xmit() is called on the bond device, which calls,
let's say, bond_xmit_xor(). The first thing bond_xmit_xor() does is try
to grab read_lock_irqsave(&bond->lock, flags). Since this lock is
already held by bond_release(), and we're on the same cpu without any
context switch, we've got ourselves a deadlock.

This actually happened to us and it took us a while to figure out the
system halt, but we've got the kdb trace to prove it.

Specifically for bonding, as stated by Dan below, it is indeed not
necessary to hold a lock_irq in every entry point in the driver.
From our experience in previous projects, we discovered that it is
sufficient to just grab a read_lock when accessing the slaves list in
any softirq level function (receive, transmit and timer), and hold a
write_lock_bh() only when changing the slaves list in ioctl calls like
bond_enslave(), bond_release(), bond_release_all() which all run at
user context.

We have created a version that uses the above scheme that is being
tested by our QA group these days. Such a major change in the locking
scheme requires a lot of testing to try and detect potential hidden
bugs and corner cases. We expect this will also increase the total
throughput, since interrupts won't be blocked each time a packet is
being transmitted or the miimon timer pops. We believe we will be able
to post the patch (+results) next week.

On Tue, 25 Mar 2003, Dan Eble wrote:

>
> (kernel is ppc 2.4.21-pre4)
>
> In bond_enslave() [drivers/net/bonding.c]:
>
>         write_lock_irqsave(&bond->lock, flags);
>         ...
>         err = netdev_set_master(slave_dev, master_dev);
>         ...
>         write_unlock_irqrestore(&bond->lock, flags);
>
> In netdev_set_master() [net/core/dev.c]:
>
>         rtmsg_ifinfo(RTM_NEWLINK, slave, IFF_SLAVE);
>
> In rtmsg_ifinfo() [net/core/rtnetlink.c]:
>
>         skb = alloc_skb(size, GFP_KERNEL);
>         ...
>         netlink_broadcast(rtnl, skb, 0, RTMGRP_LINK, GFP_KERNEL);
>
> Doesn't this admit the possibility of sleeping with interrupts disabled?
> I found it because I'm working on a driver that uses a master-slave
> relationship like the bonding driver, and decided I didn't really need to
> disable interrupts, so I tried using write_lock_bh() instead.  The result
> was an "alloc_skb called nonatomically from interrupt" message because
> write_lock_bh() increments the local BH count (which seems reasonable).
>
> A bigger question: Why are the IRQ check and the BH check inconsistent?
> That is, local_bh_count() says "yes" if you are currently running in BH
> context OR have disabled BHs; however, local_irq_count() says "yes" if you
> are currently running in interrupt context, but it says nothing (as far as
> I have seen) about whether IRQs are enabled or disabled.  Is this (a) the
> Right Way, (b) something that's more trouble to fix than to be burned-by
> once and then avoid for the rest of your life, or (c) totally horked?
>
> --
> Dan Eble                    _____  .
>                            |  _  |/|
> Applied Innovation Inc.    | |_| | |
> http://www.aiinet.com/     |__/|_|_|
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-net" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>

--
| Shmulik Hen                             |
| Israel Design Center (Jerusalem)        |
| LAN Access Division                     |
| Intel Communications Group, Intel corp. |

From ahu@outpost.ds9a.nl Thu Mar 27 05:37:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 05:37:28 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RDb1q9023751 for ; Thu, 27 Mar 2003 05:37:22 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id D84174017; Thu, 27 Mar 2003 14:36:59 +0100 (CET) Date: Thu, 27 Mar 2003 14:36:59 +0100 From: bert hubert To: "John S. Denker" Cc: netdev Subject: Re: ?completeness of IPsec feature-set Message-ID: <20030327133659.GA11820@outpost.ds9a.nl> Mail-Followup-To: bert hubert , "John S. 
Denker" , netdev References: <3E82DCF7.7090706@monmouth.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E82DCF7.7090706@monmouth.com> User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2061 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 623 Lines: 18

On Thu, Mar 27, 2003 at 06:13:59AM -0500, John S. Denker wrote:

> For example, consider the challenge of establishing an
> ordinary VPN where N-1 of the gateways have changeable
> wild-side IP addresses. AFAICT nobody knows how to get
> racoon to do this.

Racoon is just an IKE daemon - Linux is not bound to it. The OpenBSD one
(isakmpd) also works under Linux. You are free to write your own.

Regards,

bert

--
http://www.PowerDNS.com      Open source, database driven DNS Software
http://lartc.org           Linux Advanced Routing & Traffic Control HOWTO
http://netherlabs.nl                         Consulting

From davem@redhat.com Thu Mar 27 05:47:52 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 05:47:57 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RDlBq9024209 for ; Thu, 27 Mar 2003 05:47:52 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id FAA03335; Thu, 27 Mar 2003 05:43:58 -0800 Date: Thu, 27 Mar 2003 05:43:57 -0800 (PST) Message-Id: <20030327.054357.17283294.davem@redhat.com> To: shmulik.hen@intel.com Cc: dane@aiinet.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, torvalds@transmeta.com, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. 
Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2062 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 534 Lines: 14

   From: shmulik.hen@intel.com
   Date: Thu, 27 Mar 2003 15:32:02 +0200 (IST)

   Further more, holding a lock_irq doesn't mean bottom halves are
   disabled too, it just means interrupts are disabled and no *new*
   softirq can be queued. Consider the following situation:

I think local_bh_enable() should check irqs_disabled() and honour that.

What you are showing here, that BH's can run via local_bh_enable()
even when IRQs are disabled, is a BUG(). IRQ disabling is meant to
be stronger than softint disabling.

Ingo/Linus?

From davem@redhat.com Thu Mar 27 06:16:30 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 06:16:33 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2REFnq9024820 for ; Thu, 27 Mar 2003 06:16:30 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id GAA03464; Thu, 27 Mar 2003 06:12:42 -0800 Date: Thu, 27 Mar 2003 06:12:41 -0800 (PST) Message-Id: <20030327.061241.105170741.davem@redhat.com> To: trond.myklebust@fys.uio.no Cc: shmulik.hen@intel.com, dane@aiinet.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, torvalds@transmeta.com, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. 
Miller" In-Reply-To: References: <20030327.054357.17283294.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2063 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 524 Lines: 14

   From: Trond Myklebust
   Date: 27 Mar 2003 15:11:56 +0100

   > IRQ disabling is meant to be stronger than softint disabling.

   In that case, you'll need to have things like spin_lock_irqrestore()
   call local_bh_enable() in order to run the pending softirqs. Is that
   worth the trouble?

"trouble" is a weird word to use when the current behavior is just
wrong. :-)

My point is that it doesn't matter what the fix is, running softints
while hw IRQs are disabled must be fixed.
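
The deadlock described in this thread can be modelled in user space. What
follows is only a minimal sketch under stated assumptions, not kernel
code: every name in it (sim_cpu, sim_local_bh_enable, sim_release and so
on) is hypothetical, and it models a single CPU with nothing more than a
lock-held flag, an IRQs-disabled flag, a BH-disable count and one pending
transmit softirq. It shows that re-enabling bottom halves while the bond
lock is held lets the pending softirq run and try to retake that lock,
and that a BH-enable path which honours the IRQs-disabled state (the
direction suggested in the thread; the real fix may look different)
defers the softirq until the lock has been released.

```c
#include <stdbool.h>

/* Hypothetical single-CPU model; none of these are real kernel symbols. */
struct sim_cpu {
	bool irqs_disabled;    /* set while the bond lock is held irqsave */
	bool bond_lock_held;   /* models write_lock_irqsave(&bond->lock, ...) */
	int  bh_count;         /* models local_bh_count */
	bool softirq_pending;  /* a transmit softirq was queued earlier */
	bool deadlock;         /* set if the softirq met the held lock */
	bool honour_irqs_off;  /* enables the check proposed in the thread */
};

/* Models the transmit path, which needs the bond lock. */
static void sim_xmit_softirq(struct sim_cpu *cpu)
{
	if (cpu->bond_lock_held)
		cpu->deadlock = true;   /* a real CPU would spin forever here */
}

/* Runs the pending softirq, if there is one. */
static void sim_do_softirq(struct sim_cpu *cpu)
{
	if (!cpu->softirq_pending)
		return;
	cpu->softirq_pending = false;
	sim_xmit_softirq(cpu);
}

static void sim_local_bh_disable(struct sim_cpu *cpu)
{
	cpu->bh_count++;
}

static void sim_local_bh_enable(struct sim_cpu *cpu)
{
	if (--cpu->bh_count != 0)
		return;
	if (cpu->honour_irqs_off && cpu->irqs_disabled)
		return;                 /* leave the softirq pending */
	sim_do_softirq(cpu);
}

/* Models the release path; returns true if the deadlock was hit. */
bool sim_release(bool honour_irqs_off)
{
	struct sim_cpu cpu = { .softirq_pending = true,
			       .honour_irqs_off = honour_irqs_off };

	cpu.irqs_disabled = true;       /* write_lock_irqsave() */
	cpu.bond_lock_held = true;

	sim_local_bh_disable(&cpu);     /* spin_lock_bh(&dev->xmit_lock) */
	sim_local_bh_enable(&cpu);      /* spin_unlock_bh(&dev->xmit_lock) */

	cpu.bond_lock_held = false;     /* write_unlock_irqrestore() */
	cpu.irqs_disabled = false;
	sim_do_softirq(&cpu);           /* deferred work can run safely now */

	return cpu.deadlock;
}
```

In this model sim_release(false) reports the deadlock and
sim_release(true) does not. The sketch deliberately ignores SMP, nested
locks and the real softirq bookkeeping; it only captures the ordering
problem.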
From shmulik.hen@intel.com Thu Mar 27 07:39:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 07:39:10 -0800 (PST) Received: from hermes.fm.intel.com (fmr01.intel.com [192.55.52.18]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RFcLq9026393 for ; Thu, 27 Mar 2003 07:39:01 -0800 Received: from petasus.fm.intel.com (petasus.fm.intel.com [10.1.192.37]) by hermes.fm.intel.com (8.11.6/8.11.6/d: outer.mc,v 1.51 2002/09/23 20:43:23 dmccart Exp $) with ESMTP id h2RFYb803210 for ; Thu, 27 Mar 2003 15:34:41 GMT Received: from fmsmsxv040-1.fm.intel.com (fmsmsxvs040.fm.intel.com [132.233.42.124]) by petasus.fm.intel.com (8.11.6/8.11.6/d: inner.mc,v 1.28 2003/01/13 19:44:39 dmccart Exp $) with SMTP id h2RFWLL26033 for ; Thu, 27 Mar 2003 15:32:21 GMT Received: from jrslxjul4.npdj.intel.com ([10.12.254.188]) by fmsmsxv040-1.fm.intel.com (NAVGW 2.5.2.11) with SMTP id M2003032707381712030 ; Thu, 27 Mar 2003 07:38:19 -0800 Date: Thu, 27 Mar 2003 17:38:02 +0200 (IST) From: shmulik.hen@intel.com X-X-Sender: hshmulik@jrslxjul4.npdj.intel.com Reply-To: shmulik.hen@intel.com To: bond-devel , bond-announce , linux-net , linux-netdev , linux-kernel , Jeff Garzik Subject: [Bonding][patch] Adding Transmit load balancing mode to bonding Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2064 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shmulik.hen@intel.com Precedence: bulk X-list: netdev Content-Length: 44927 Lines: 1395 This patch adds support for Transmit Load Balancing policy. This mode provides load sharing without any special support or configuration from the switch (Ether-Channel, etc.). Every active slave with the highest speed in the bond transmits (using its unique hw address) while the current primary slave receives. 
If the "receiving" slave fails, another slave takes over the MAC address of the failed receiving slave in order not to confuse the switch. Balancing is connection oriented (e.g. by IPv4 destination address) so packet order is always kept. Traffic is rebalanced periodically (every 10 sec by default) while taking into account previous load on each slave. Some types of protocols (e.g. ARP) are always sent through the current slave. This patch is against bonding 2.4.20-20030317 and relies on the application of the previous set of 9 patches (submitted on March 20th) because they contain all the features required for this mode's operation. diff -Nuarp linux-2.4.20-bonding-20030317/Documentation/networking/bonding.txt linux-2.4.20-bonding-20030317-devel/Documentation/networking/bonding.txt --- linux-2.4.20-bonding-20030317/Documentation/networking/bonding.txt 2003-03-27 17:00:05.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/Documentation/networking/bonding.txt 2003-03-27 17:00:06.000000000 +0200 @@ -209,9 +209,9 @@ max_bonds mode - Specifies one of four bonding policies. The default is -round-robin (balance-rr). Possible values are (you can use either the -text or numeric option): + Specifies one of four bonding policies. The default is + round-robin (balance-rr). Possible values are (you can use + either the text or numeric option): balance-rr or 0 Round-robin policy: Transmit in a sequential order @@ -226,7 +226,7 @@ text or numeric option): to avoid confusing the switch. This mode provides fault tolerance. - balance-xor or 2 + balance-xor or 2 XOR policy: Transmit based on [(source MAC address XOR'd with destination MAC address) modula slave count]. This selects the same slave for each @@ -237,10 +237,22 @@ text or numeric option): Broadcast policy: transmits everything on all slave interfaces. This mode provides fault tolerance. - 802.3ad or 4 - IEEE 802.3ad Dynamic link aggregation. Creates aggregation - groups that share the same speed and duplex settings. 
- Transmits and receives on all slaves in the active aggregator. + 802.3ad or 4 + IEEE 802.3ad Dynamic link aggregation. Creates aggregation + groups that share the same speed and duplex settings. + Transmits and receives on all slaves in the active aggregator. + This mode requires Ethtool support in the base drivers for + retrieving the speed of each slave. + + tlb or 5 + Adaptive transmit load balancing: channel bonding that does + not require any special switch support. The outgoing traffic + is distributed according to the current load (computed relative + to the speed) on each slave. Incoming traffic is received by + the current slave. If the receiving slave fails, another slave + takes over the MAC address of the failed receiving slave. + This mode requires Ethtool support in the base drivers for + retrieving the speed of each slave. miimon @@ -550,7 +562,7 @@ Frequently Asked Questions * Cisco 6500 series (look for lacp). * Foundry Big Iron 4000 - In active-backup mode, it should work with any Layer-II switche. + In active-backup and tlb modes, it should work with any Layer-II switch. 8. Where does a bonding device get its MAC address from? @@ -605,7 +617,12 @@ Frequently Asked Questions 802.3ad, based on XOR but distributes traffic among all interfaces in the active aggregator. - + + Transmit load balancing balances the traffic according to the + current load on each slave. The balancing is clients based and the + least loaded is slave is selected for a new client. The load of each + slave is calculated relative to its speed and enables load balancing + in mixed speed teams. High Availability ================= @@ -841,10 +858,6 @@ The main limitations are : Use the arp_interval/arp_ip_target parameters to count incoming/outgoing frames. - - A Transmit Load Balancing policy is not currently available. This mode - allows every slave in the bond to transmit while only one receives. 
If - the "receiving" slave fails, another slave takes over the MAC address of - the failed receiving slave. Resources and Links diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_alb.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_alb.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_alb.c 1970-01-01 02:00:00.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_alb.c 2003-03-27 17:00:06.000000000 +0200 @@ -0,0 +1,709 @@ +/**************************************************************************** + Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along with + this program; if not, write to the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston, MA 02111-1307, USA. + + The full GNU General Public License is included in this distribution in the + file called LICENSE. +*****************************************************************************/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "bonding.h" +#include "bond_alb.h" + + +#define ALB_TIMER_TICKS_PER_SEC 10 //should be a divisor of HZ +#define BOND_TLB_REBALANCE_INTERVAL 10 //seconds (used for division - never set to zero !!!) 
+#define BOND_ALB_LP_INTERVAL 1 //seconds + +#define BOND_TLB_REBALANCE_TICKS_INTERVAL (BOND_TLB_REBALANCE_INTERVAL*ALB_TIMER_TICKS_PER_SEC) +#define BOND_ALB_LP_TICKS_INTERVAL (BOND_ALB_LP_INTERVAL*ALB_TIMER_TICKS_PER_SEC) + +#define TLB_HASH_TABLE_SIZE 256 // The size of the clients hash table. + // Note that this value MUST NOT be smaller + // because the key to the hash table is of BYTE! + + +#define TLB_NULL_INDEX 0xffffffff +#define MAX_LP_RETRY 3 + +#pragma pack(1) +struct learning_pkt { + u8 mac_dst[ETH_ALEN]; + u8 mac_src[ETH_ALEN]; + u16 type; + u8 padding[ETH_ZLEN - (2*ETH_ALEN + 2)]; +}; + +#pragma pack() + +static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[]); +static void alb_set_mac_addr(struct slave *slave, u8 addr[]); +static void alb_swap_mac_addr(struct bonding *bond, struct slave *slave1, struct slave *slave2); +static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave); +static void alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave); +static struct slave* tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len); +static struct slave* tlb_get_least_loaded_slave(struct bonding *bond); + +static inline void _lock_tx_hashtbl(struct bonding *bond) +{ + spin_lock(&(BOND_ALB_INFO(bond).tx_hashtbl_lock)); +} + +static inline void _unlock_tx_hashtbl(struct bonding *bond) +{ + spin_unlock(&(BOND_ALB_INFO(bond).tx_hashtbl_lock)); +} + +//Caller must hold tx_hashtbl lock +static inline void tlb_init_table_entry(struct bonding *bond, u8 index, u8 save_load) +{ + struct tlb_client_info *entry; + + if (BOND_ALB_INFO(bond).tx_hashtbl == NULL) { + return; + } + + entry = &(BOND_ALB_INFO(bond).tx_hashtbl[index]); + + if (save_load) { + entry->load_history = 1 + entry->tx_bytes / BOND_TLB_REBALANCE_INTERVAL; + } else { + entry->load_history = 1; + } + entry->tx_slave = NULL; + entry->tx_bytes = 0; + entry->next = TLB_NULL_INDEX; + entry->prev = TLB_NULL_INDEX; +} + +static 
inline void tlb_init_slave(struct slave *curr_slave, u8 save_load) +{ + struct tlb_slave_info *slave_info = &(SLAVE_TLB_INFO(curr_slave)); + + if (save_load) { + slave_info->load = slave_info->rx_bytes / BOND_TLB_REBALANCE_INTERVAL; + } else { + slave_info->load = 0; + } + slave_info->head = TLB_NULL_INDEX; + slave_info->rx_bytes = 0; +} + +//must be locked with the bond read lock +static inline void tlb_clear_slave(struct bonding *bond, struct slave *slave) +{ + struct tlb_client_info *tx_hash_table = NULL; + u32 index, next_index; + + if (!slave) { + return; + } + + //clear slave from tx_hashtbl + _lock_tx_hashtbl(bond); + tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl; + + if (tx_hash_table) { + index = SLAVE_TLB_INFO(slave).head; + while (index != TLB_NULL_INDEX) { + next_index = tx_hash_table[index].next; + tlb_init_table_entry(bond, index, 1); + index = next_index; + } + } + _unlock_tx_hashtbl(bond); + + tlb_init_slave(slave, 1); +} + +static inline u8 _simple_hash(u8 *hash_start, int hash_size) +{ + int i; + u8 hash = 0; + + for (i=0; itx_hashtbl_lock)); + + _lock_tx_hashtbl(bond); + if (bond_info->tx_hashtbl != NULL) { + printk (KERN_ERR "%s: TLB hash table is not NULL\n", bond->device->name); + _unlock_tx_hashtbl(bond); + return -1; + } + + size = TLB_HASH_TABLE_SIZE * sizeof(struct tlb_client_info); + bond_info->tx_hashtbl = kmalloc(size, GFP_KERNEL); + if (bond_info->tx_hashtbl == NULL) { + printk (KERN_ERR "%s: Failed to allocate TLB hash table\n", bond->device->name); + _unlock_tx_hashtbl(bond); + return -1; + } + + for (i=0; ilock, flags); + slave = bond->next; + while (slave && (slave != (struct slave *)bond)) { + tlb_init_slave(slave, 0); + slave = bond_get_next_slave(bond, slave); + } + read_unlock_irqrestore(&bond->lock, flags); + + return 0; +} + +int bond_alb_deinitialize(struct bonding *bond) +{ + struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); + + _lock_tx_hashtbl(bond); + if (bond_info->tx_hashtbl == NULL) { + 
_unlock_tx_hashtbl(bond); + return 0; + } + kfree(bond_info->tx_hashtbl); + bond_info->tx_hashtbl = NULL; + _unlock_tx_hashtbl(bond); + + return 0; +} + +int bond_alb_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct bonding *bond = (struct bonding *) dev->priv; + struct ethhdr *eth_data = (struct ethhdr *)skb->data; + struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); + struct slave *tx_slave = NULL; + unsigned long flags; + char do_tx_balance = 1; + int hash_size = 0; + u32 hash_index = 0; + u8 *hash_start = NULL; + u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff}; + + if (!IS_UP(dev)) { /* bond down */ + dev_kfree_skb(skb); + return 0; + } + + //make sure that the current_slave and the slaves list do not change during tx + read_lock_irqsave(&bond->lock, flags); + + if (bond->slave_cnt == 0) { + /* no suitable interface, frame not sent */ + dev_kfree_skb(skb); + read_unlock_irqrestore(&bond->lock, flags); + return 0; + } + + read_lock(&bond->ptrlock); + + switch (ntohs(skb->protocol)) { + case ETH_P_IP: + if ((memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) || + (skb->nh.iph->daddr == 0xffffffff)) { + do_tx_balance = 0; + break; + } + hash_start = (char*)&(skb->nh.iph->daddr); + hash_size = 4; + break; + + case ETH_P_IPV6: + if (memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) { + do_tx_balance = 0; + break; + } + + hash_start = (char*)&(skb->nh.ipv6h->daddr); + hash_size = 16; + break; + + case ETH_P_IPX: + if (skb->nh.ipxh->ipx_checksum != __constant_htons(IPX_NO_CHECKSUM)) { + do_tx_balance = 0; + break; + } + + if (skb->nh.ipxh->ipx_type != __constant_htons(IPX_TYPE_NCP)) { + do_tx_balance = 0; + break; + } + + hash_start = (char*)eth_data->h_dest; + hash_size = ETH_ALEN; + break; + + default: + do_tx_balance = 0; + break; + } + + if (do_tx_balance) { + hash_index = _simple_hash(hash_start, hash_size); + tx_slave = tlb_choose_channel(bond, hash_index, skb->len); + } + + if (!tx_slave) { + //unbalanced or unassigned, send 
through primary + tx_slave = bond->current_slave; + bond_info->unbalanced_load += skb->len; + } + + if (tx_slave && SLAVE_IS_OK(tx_slave)) { + skb->dev = tx_slave->dev; + if (tx_slave != bond->current_slave) { + memcpy(eth_data->h_source, tx_slave->dev->dev_addr, ETH_ALEN); + } + dev_queue_xmit(skb); + } else { + // no suitable interface, frame not sent + if (tx_slave) { + tlb_clear_slave(bond, tx_slave); + } + dev_kfree_skb(skb); + //printk (KERN_ERR "no suitable channel found - freeing packet\n"); + } + + read_unlock(&bond->ptrlock); + read_unlock_irqrestore(&bond->lock, flags); + return 0; +} + +void bond_alb_monitor(struct bonding *bond) +{ + struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); + struct slave *slave = NULL; + unsigned long flags; + + read_lock_irqsave(&bond->lock, flags); + + if (bond->next == (struct slave*)bond) { + bond_info->tx_rebalance_counter = 0; + bond_info->lp_counter = 0; + goto out; + } + + bond_info->tx_rebalance_counter++; + bond_info->lp_counter++; + + //send learning packets + if (bond_info->lp_counter >= BOND_ALB_LP_TICKS_INTERVAL) { + slave = bond->next; + for (; slave && (slave != (struct slave *)bond); slave = bond_get_next_slave(bond, slave)) { + alb_send_learning_packets(slave, slave->dev->dev_addr); + } + bond_info->lp_counter = 0; + } + + //rebalance tx traffic + if (bond_info->tx_rebalance_counter >= BOND_TLB_REBALANCE_TICKS_INTERVAL) { + slave = bond->next; + for (; slave && (slave != (struct slave *)bond); slave = bond_get_next_slave(bond, slave)) { + tlb_clear_slave(bond, slave); + read_lock(&bond->ptrlock); + if (slave == bond->current_slave) { + SLAVE_TLB_INFO(slave).load = + bond_info->unbalanced_load / BOND_TLB_REBALANCE_INTERVAL; + bond_info->unbalanced_load = 0; + } + read_unlock(&bond->ptrlock); + } + bond_info->tx_rebalance_counter = 0; + } + +out: + read_unlock_irqrestore(&bond->lock, flags); + // re-arm the timer + mod_timer(&(bond_info->alb_timer), jiffies + (HZ/ALB_TIMER_TICKS_PER_SEC)); +} + +void 
bond_alb_init_slave(struct bonding *bond, struct slave *slave) +{ + alb_set_mac_addr(slave, slave->perm_hwaddr); + tlb_init_slave(slave, 0); + + if (bond->slave_cnt > 1) { + alb_handle_addr_collision_on_attach(bond, slave); + } + + //order a rebalance ASAP + bond->alb_info.tx_rebalance_counter = BOND_TLB_REBALANCE_TICKS_INTERVAL; +} + +//Must hold bond->lock for write +//slave has already been detached from the list +void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave) +{ + if (bond->slave_cnt > 1) { + alb_change_hw_addr_on_detach(bond, slave); + } + + tlb_clear_slave(bond, slave); +} + +void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link) +{ + struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); + if (slave == (struct slave *)bond) { + return; + } + + if (link == BOND_LINK_FAIL) { + tlb_clear_slave(bond, slave); + } + + if (link == BOND_LINK_UP) { + //order a rebalance ASAP + bond_info->tx_rebalance_counter = BOND_TLB_REBALANCE_TICKS_INTERVAL; + } +} + +//Must hold write ptr lock +struct slave* bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave) +{ + struct slave *old_slave=NULL, *swap_slave=NULL; + + if (!new_slave || (bond->slave_cnt == 0)) { + return NULL; + } + + swap_slave = old_slave = bond->current_slave; + + if (old_slave == new_slave) { + return new_slave; + } + + if (!old_slave) { + //find slave that is holding the bond's mac address + swap_slave = bond->next; + for (; swap_slave; swap_slave = bond_get_next_slave(bond, swap_slave)) { + if (memcmp(swap_slave->dev->dev_addr, bond->device->dev_addr, ETH_ALEN) == 0) { + break; + } + } + } + + if (swap_slave) { + //swap mac address + alb_swap_mac_addr(bond, swap_slave, new_slave); + } else { + //set the new_slave to the bond mac address + alb_set_mac_addr(new_slave, bond->device->dev_addr); + //fasten bond mac on new current slave + alb_send_learning_packets(new_slave, bond->device->dev_addr); + } + + return new_slave; +} + 
+static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
+{
+	struct sk_buff *skb = NULL;
+	struct learning_pkt pkt;
+	char *data = NULL;
+	int i;
+	unsigned int size = sizeof(struct learning_pkt);
+
+	memset(&pkt, 0, size);
+	memcpy(pkt.mac_dst, mac_addr, ETH_ALEN);
+	memcpy(pkt.mac_src, mac_addr, ETH_ALEN);
+	pkt.type = __constant_htons(ETH_P_LOOP);
+
+	for (i=0; i < MAX_LP_RETRY; i++) {
+		skb = NULL;
+		skb = dev_alloc_skb(size);
+		if (!skb) {
+			return;
+		}
+
+		data = skb_put(skb, size);
+		memcpy(data, &pkt, size);
+		skb->mac.raw = data;
+		skb->nh.raw = data + ETH_HLEN;
+		skb->protocol = pkt.type;
+		skb->priority = TC_PRIO_CONTROL;
+		skb->dev = slave->dev;
+		dev_queue_xmit(skb);
+	}
+}
+
+static void alb_set_mac_addr(struct slave *slave, u8 addr[])
+{
+	if (!slave) {
+		return;
+	}
+
+	memcpy(slave->dev->dev_addr, addr, ETH_ALEN);
+}
+
+//must be called under bond lock (must be sure that slaves cannot be removed during function exec)
+//and current primary lock
+static void alb_swap_mac_addr(struct bonding *bond, struct slave *slave1, struct slave *slave2)
+{
+	u8 tmp_mac_addr[ETH_ALEN];
+
+	if (!slave1 || !slave2) {
+		return;
+	}
+
+	memcpy(tmp_mac_addr, slave1->dev->dev_addr, ETH_ALEN);
+	alb_set_mac_addr(slave1, slave2->dev->dev_addr);
+	alb_set_mac_addr(slave2, tmp_mac_addr);
+
+	//fasten the change in the switch
+	if (SLAVE_IS_OK(slave1)) {
+		alb_send_learning_packets(slave1, slave1->dev->dev_addr);
+	}
+	if (SLAVE_IS_OK(slave2)) {
+		alb_send_learning_packets(slave2, slave2->dev->dev_addr);
+	}
+}
+
+/**
+ * alb_change_hw_addr_on_detach
+ * @bond: bonding we're working on
+ * @slave: the slave that was just detached
+ *
+ * We assume that @slave was already detached from the slave list.
+ *
+ * If @slave's permanent hw address is different both from its current address
+ * and from @bond's address, then somewhere in the bond there's a slave that has
+ * @slave's permanent address as its current address. We'll make sure that that
+ * slave no longer uses @slave's permanent address.
+ *
+ * Assumes bond->lock is held for writing.
+ */
+static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
+{
+	struct slave *tmp_slave;
+	int perm_curr_diff;
+	int perm_bond_diff;
+
+	perm_curr_diff = memcmp(slave->perm_hwaddr,
+				slave->dev->dev_addr,
+				ETH_ALEN);
+	perm_bond_diff = memcmp(slave->perm_hwaddr,
+				bond->device->dev_addr,
+				ETH_ALEN);
+	if (perm_curr_diff && perm_bond_diff) {
+		tmp_slave = bond->next;
+		for (; tmp_slave; tmp_slave = bond_get_next_slave(bond, tmp_slave)) {
+			if (memcmp(slave->perm_hwaddr,
+				   tmp_slave->dev->dev_addr,
+				   ETH_ALEN) == 0) {
+				break;
+			}
+		}
+		if (tmp_slave) {
+			alb_set_mac_addr(tmp_slave, slave->dev->dev_addr);
+			alb_send_learning_packets(tmp_slave, slave->dev->dev_addr);
+		}
+	}
+}
+
+/**
+ * alb_handle_addr_collision_on_attach
+ * @bond: bonding we're working on
+ * @slave: the slave that was just attached
+ *
+ * If the permanent hw address of @slave is @bond's hw address, we need to find
+ * a different hw address to give @slave, that isn't in use by any other slave
+ * in the bond. This address must be, of course, one of the permanent addresses
+ * of the other slaves.
+ *
+ * We go over the slave list, and for each slave there we compare its permanent
+ * hw address with the current address of all the other slaves. If no match was
+ * found, then we've found a slave with a permanent address that isn't used by
+ * any other slave in the bond, so we can assign it to @slave.
+ *
+ * Assumes bond->lock is held for writing.
+ */ +static void alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave) +{ + struct slave *tmp_slave1, *tmp_slave2; + + if (memcmp(slave->perm_hwaddr, bond->device->dev_addr, ETH_ALEN)) { + return; + } + + tmp_slave1 = bond->next; + for (; tmp_slave1; tmp_slave1 = bond_get_next_slave(bond, tmp_slave1)) { + if (tmp_slave1 == slave) { + continue; + } + + tmp_slave2 = bond->next; + for (; tmp_slave2; tmp_slave2 = bond_get_next_slave(bond, tmp_slave2)) { + if (tmp_slave2 == slave) { + continue; + } + + if (!memcmp(tmp_slave1->perm_hwaddr, + tmp_slave2->dev->dev_addr, + ETH_ALEN)) { + + break; + } + } + + if (!tmp_slave2) { + // no slave has tmp_slave1's perm addr as its curr addr + break; + } + } + + if (tmp_slave1) { + alb_set_mac_addr(slave, tmp_slave1->perm_hwaddr); + + printk(KERN_WARNING "bonding: Warning: the hw address " + "of slave %s is not unique; " + "giving it the hw address of %s\n", + slave->dev->name, tmp_slave1->dev->name); + } else { + printk(KERN_CRIT "bonding: Error: the hw address " + "of slave %s is not unique; " + "couldn't find a slave with a free hw " + "address to give it (this should not have " + "happened)\n", slave->dev->name); + printk(KERN_CRIT "Communications may become unstable " + "(confused switch). 
You might want to remove " + "this slave\n"); + } +} + +//Must hold bond->lock (read) +struct slave* tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len) +{ + struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); + struct tlb_client_info *hash_table = NULL; + struct slave *assigned_slave = NULL; + + _lock_tx_hashtbl(bond); + + hash_table = bond_info->tx_hashtbl; + if (hash_table == NULL) { + printk (KERN_ERR "%s: TLB hash table is NULL\n", bond->device->name); + _unlock_tx_hashtbl(bond); + return NULL; + } + + assigned_slave = hash_table[hash_index].tx_slave; + if (!assigned_slave) { + assigned_slave = tlb_get_least_loaded_slave(bond); + + if (assigned_slave) { + struct tlb_slave_info *slave_info = &(SLAVE_TLB_INFO(assigned_slave)); + u32 next_index = slave_info->head; + + hash_table[hash_index].tx_slave = assigned_slave; + hash_table[hash_index].next = next_index; + hash_table[hash_index].prev = TLB_NULL_INDEX; + hash_table[hash_index].tx_bytes += skb_len; + + if (next_index != TLB_NULL_INDEX) { + hash_table[next_index].prev = hash_index; + } + + slave_info->head = hash_index; + slave_info->load += hash_table[hash_index].load_history; + } + } + + _unlock_tx_hashtbl(bond); + + return assigned_slave; +} + +//Must hold bond->lock +static struct slave* tlb_get_least_loaded_slave(struct bonding *bond) +{ + struct slave *slave = bond->next; + struct slave *least_loaded; + u32 curr_gap, max_gap; + + if (slave == (struct slave *)bond) { + return NULL; + } + + // Find the first enabled slave + while (slave && (slave != (struct slave *)bond)) { + if (SLAVE_IS_OK(slave)) { + break; + } + slave = bond_get_next_slave(bond, slave); + } + + if (!slave) { + return NULL; + } + + least_loaded = slave; + slave = bond_get_next_slave(bond, slave); + max_gap = (least_loaded->speed * 1000000) - + (SLAVE_TLB_INFO(least_loaded).load * 8); + + while (slave && (slave != (struct slave *)bond)) { + if (SLAVE_IS_OK(slave)) { + curr_gap = (slave->speed * 1000000) - + 
(SLAVE_TLB_INFO(slave).load * 8); + if (max_gap < curr_gap) { + least_loaded = slave; + max_gap = curr_gap; + } + } + slave = bond_get_next_slave(bond, slave); + } + + return least_loaded; +} + diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_alb.h linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_alb.h --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_alb.h 1970-01-01 02:00:00.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_alb.h 2003-03-27 17:00:06.000000000 +0200 @@ -0,0 +1,72 @@ +/**************************************************************************** + Copyright(c) 1999 - 2003 Intel Corporation. All rights reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along with + this program; if not, write to the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston, MA 02111-1307, USA. + + The full GNU General Public License is included in this distribution in the + file called LICENSE. +*****************************************************************************/ + +#ifndef __BOND_ALB_H__ +#define __BOND_ALB_H__ + +#include + + +#define BOND_ALB_INFO(bond) ((bond)->alb_info) +#define SLAVE_TLB_INFO(slave) ((slave)->tlb_info) + +struct tlb_client_info { + struct slave *tx_slave; //A pointer to slave used for transmit packets to a Client + //that the Hash function gave this entry index. 
+	u32 tx_bytes;		//Each Client accumulates the BytesTx that were transmitted to it and
+				//after each CallBack the LoadHistory gets it divided by the balance interval
+	u32 load_history;	//This field contains the amount of Bytes that were transmitted to
+				//this client by the server on the previous balance interval in Bps.
+	u32 next;		//The next Hash table entry index, assigned to use the same adapter
+				//for transmit.
+	u32 prev;		//The previous Hash table entry index, assigned to use the same
+};
+
+struct tlb_slave_info {
+	u32 head;	// Index to the head of the bi-directional clients
+			// hash table entries list. The entries in the list
+			// are the entries that were assigned to use this
+			// slave for transmit.
+	u32 rx_bytes;	// The number of bytes received through this adapter.
+	u32 load;	// Each slave sums the loadHistory of all clients assigned to it
+	u8 rlb_promisc;	//slave is set to promiscuous if slave->dev->dev_addr != hw mac address
+};
+
+struct alb_bond_info {
+	struct timer_list alb_timer;
+	struct tlb_client_info *tx_hashtbl;	//Dynamically allocated
+	spinlock_t tx_hashtbl_lock;
+	u32 unbalanced_load;
+	int tx_rebalance_counter;
+	int lp_counter;
+};
+
+int bond_alb_initialize(struct bonding *bond);
+int bond_alb_deinitialize(struct bonding *bond);
+int bond_alb_xmit(struct sk_buff *skb, struct net_device *dev);
+void bond_alb_monitor(struct bonding *bond);
+void bond_alb_init_slave(struct bonding *bond, struct slave *slave);
+void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave);
+void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link);
+struct slave* bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave);
+
+#endif //__BOND_ALB_H__
+
diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bonding.h linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bonding.h
--- linux-2.4.20-bonding-20030317/drivers/net/bonding/bonding.h	2003-03-27 17:00:05.000000000 +0200
+++
linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bonding.h 2003-03-27 17:00:06.000000000 +0200 @@ -23,6 +23,7 @@ #include #include #include "bond_3ad.h" +#include "bond_alb.h" #ifdef BONDING_DEBUG @@ -67,6 +68,7 @@ typedef struct slave { u8 duplex; u8 perm_hwaddr[ETH_ALEN]; struct ad_slave_info ad_info; // HUGE struct. maybe alloc dynamically + struct tlb_slave_info tlb_info; } slave_t; /* @@ -99,6 +101,7 @@ typedef struct bonding { struct dev_mc_list *mc_list; unsigned short flags; struct ad_bond_info ad_info; + struct alb_bond_info alb_info; } bonding_t; void bond_set_slave_active_flags(slave_t *slave); diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c --- linux-2.4.20-bonding-20030317/drivers/net/bonding/bond_main.c 2003-03-27 17:00:05.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/bond_main.c 2003-03-27 17:00:06.000000000 +0200 @@ -324,6 +324,14 @@ * Tsippy Mendelson and * Shmulik Hen * - Added support for IEEE 802.3ad Dynamic link aggregatoin mode. + * + * 2003/03/27 - Amir Noam , + * Tsippy Mendelson and + * Shmulik Hen + * - Added support for Transmit load balancing mode. + * - Concentrate all assignments of current_slave to a single point + * so specific modes can take actions when the primary adapter is + * changed. 
*/ #include @@ -365,6 +373,7 @@ #include #include "bonding.h" #include "bond_3ad.h" +#include "bond_alb.h" #define DRV_VERSION "2.4.20-20030317" #define DRV_RELDATE "March 17, 2003" @@ -416,6 +425,7 @@ static struct bond_parm_tbl bond_mode_tb { "balance-xor", BOND_MODE_XOR}, { "broadcast", BOND_MODE_BROADCAST}, { "802.3ad", BOND_MODE_8023AD}, +{ "tlb", BOND_MODE_TLB}, { NULL, -1}, }; @@ -484,6 +494,19 @@ static int bond_sethwaddr(struct net_dev */ static int bond_get_info(char *buf, char **start, off_t offset, int length); +/* Caller must hold bond->ptrlock for write */ +static inline struct slave* +bond_assign_current_slave(struct bonding *bond,struct slave *newslave) +{ + if (bond_mode == BOND_MODE_TLB) { + bond->current_slave = bond_alb_handle_active_change(bond, newslave); + } else { + bond->current_slave = newslave; + } + + return bond->current_slave; +} + /* #define BONDING_DEBUG 1 */ /* several macros */ @@ -514,7 +537,9 @@ bond_mode_name(void) return "fault-tolerance (broadcast)"; case BOND_MODE_8023AD: return "IEEE 802.3ad Dynamic link aggregation"; - default : + case BOND_MODE_TLB: + return "transmit load balancing"; + default: return "unknown"; } } @@ -592,20 +617,11 @@ bond_detach_slave(bonding_t *bond, slave if (bond->next == slave) { /* is the slave at the head ? */ if (bond->prev == slave) { /* is the slave alone ? 
*/ - write_lock(&bond->ptrlock); - bond->current_slave = NULL; /* no slave anymore */ - write_unlock(&bond->ptrlock); bond->prev = bond->next = (slave_t *)bond; } else { /* not alone */ bond->next = slave->next; slave->next->prev = (slave_t *)bond; bond->prev->next = slave->next; - - write_lock(&bond->ptrlock); - if (bond->current_slave == slave) { - bond->current_slave = slave->next; - } - write_unlock(&bond->ptrlock); } } else { slave->prev->next = slave->next; @@ -614,15 +630,16 @@ bond_detach_slave(bonding_t *bond, slave } else { slave->next->prev = slave->prev; } - - write_lock(&bond->ptrlock); - if (bond->current_slave == slave) { - bond->current_slave = slave->next; - } - write_unlock(&bond->ptrlock); } update_slave_cnt(bond, -1); + write_lock(&bond->ptrlock); + if (bond->next != (slave_t *)bond) { /* found one slave */ + bond_assign_current_slave(bond, bond->next); + } else { + bond_assign_current_slave(bond, NULL); + } + write_unlock(&bond->ptrlock); return slave; } @@ -878,6 +895,18 @@ static int bond_open(struct net_device * bond_register_lacpdu(bond); } + if (bond_mode == BOND_MODE_TLB) { + struct timer_list *alb_timer = &(BOND_ALB_INFO(bond).alb_timer); + if (bond_alb_initialize(bond)) { + return -1; + } + init_timer(alb_timer); + alb_timer->expires = jiffies + 1; + alb_timer->data = (unsigned long)bond; + alb_timer->function = (void *)&bond_alb_monitor; + add_timer(alb_timer); + } + return 0; } @@ -906,6 +935,12 @@ static int bond_close(struct net_device bond_unregister_lacpdu(bond); } + if (bond_mode == BOND_MODE_TLB) { + del_timer_sync(&(BOND_ALB_INFO(bond).alb_timer)); + + bond_alb_deinitialize(bond); + } + if (bond->next != (struct slave *) bond) { /* Release the bonded slaves */ bond_release_all(master); @@ -1384,7 +1419,7 @@ static int bond_enslave(struct net_devic #endif /* first slave or no active slave yet, and this link is OK, so make this interface the active one */ - bond->current_slave = new_slave; + bond_assign_current_slave(bond, 
new_slave); bond_set_slave_active_flags(new_slave); bond_mc_update(bond, new_slave, NULL); } @@ -1421,6 +1456,17 @@ static int bond_enslave(struct net_devic } bond_3ad_bind_slave(new_slave); + } else if (bond_mode == BOND_MODE_TLB) { + new_slave->state = BOND_STATE_ACTIVE; + if ((bond->current_slave == NULL) && (new_slave->link == BOND_LINK_UP)) { + /* + * first slave or no active slave yet, and this link + * is OK, so make this interface the active one + */ + bond_assign_current_slave(bond, new_slave); + } + + bond_alb_init_slave(bond, new_slave); } else { #ifdef BONDING_DEBUG printk(KERN_CRIT "This slave is always active in trunk mode\n"); @@ -1428,7 +1474,7 @@ static int bond_enslave(struct net_devic /* always active in trunk mode */ new_slave->state = BOND_STATE_ACTIVE; if (bond->current_slave == NULL) - bond->current_slave = new_slave; + bond_assign_current_slave(bond, new_slave); } write_unlock_irqrestore(&bond->lock, flags); @@ -1499,7 +1545,7 @@ static int bond_change_active(struct net bond_set_slave_inactive_flags(oldactive); bond_set_slave_active_flags(newactive); bond_mc_update(bond, newactive, oldactive); - bond->current_slave = newactive; + bond_assign_current_slave(bond, newactive); printk("%s : activate %s(old : %s)\n", master_dev->name, newactive->dev->name, oldactive->dev->name); @@ -1536,14 +1582,14 @@ slave_t *change_active_interface(bonding if (newslave == NULL) { /* there were no active slaves left */ if (bond->next != (slave_t *)bond) { /* found one slave */ write_lock(&bond->ptrlock); - newslave = bond->current_slave = bond->next; + newslave = bond_assign_current_slave(bond, bond->next); write_unlock(&bond->ptrlock); } else { printk (" but could not find any %s interface.\n", (bond_mode == BOND_MODE_ACTIVEBACKUP) ? 
"backup":"other"); write_lock(&bond->ptrlock); - bond->current_slave = (slave_t *)NULL; + bond_assign_current_slave(bond, NULL); write_unlock(&bond->ptrlock); return NULL; /* still no slave, return NULL */ } @@ -1588,7 +1634,7 @@ slave_t *change_active_interface(bonding } write_lock(&bond->ptrlock); - bond->current_slave = newslave; + bond_assign_current_slave(bond, newslave); write_unlock(&bond->ptrlock); return newslave; } @@ -1618,7 +1664,7 @@ slave_t *change_active_interface(bonding bond_set_slave_active_flags(bestslave); bond_mc_update(bond, bestslave, oldslave); write_lock(&bond->ptrlock); - bond->current_slave = bestslave; + bond_assign_current_slave(bond, bestslave); write_unlock(&bond->ptrlock); return bestslave; } @@ -1643,7 +1689,7 @@ slave_t *change_active_interface(bonding /* absolutely nothing found. let's return NULL */ write_lock(&bond->ptrlock); - bond->current_slave = (slave_t *)NULL; + bond_assign_current_slave(bond, NULL); write_unlock(&bond->ptrlock); return NULL; } @@ -1685,8 +1731,30 @@ static int bond_release(struct net_devic old_current = bond->current_slave; while ((our_slave = our_slave->prev) != (slave_t *)bond) { if (our_slave->dev == slave) { + int mac_addr_differ = memcmp(bond->device->dev_addr, + our_slave->perm_hwaddr, + ETH_ALEN); + if (!mac_addr_differ && (bond->slave_cnt > 1)) { + printk(KERN_WARNING "WARNING: the permanent HWaddr of %s " + "- %02X:%02X:%02X:%02X:%02X:%02X - " + "is still in use by %s. Set the HWaddr " + "of %s to a different address " + "to avoid conflicts.\n", + slave->name, + slave->dev_addr[0], + slave->dev_addr[1], + slave->dev_addr[2], + slave->dev_addr[3], + slave->dev_addr[4], + slave->dev_addr[5], + bond->device->name, + slave->name); + } + /* Inform AD package of unbinding of slave. 
*/ if (bond_mode == BOND_MODE_8023AD) { + //must be called before the slave is + //detached from the list bond_3ad_unbind_slave(our_slave); } @@ -1715,6 +1783,13 @@ static int bond_release(struct net_devic bond->primary_slave = NULL; } + if (bond_mode == BOND_MODE_TLB) { + /* must be called only after the slave has been + * detached from the list + */ + bond_alb_deinit_slave(bond, our_slave); + } + break; } @@ -1791,7 +1866,7 @@ static int bond_release_all(struct net_d bond = (struct bonding *) master->priv; bond->current_arp_slave = NULL; - bond->current_slave = NULL; + bond_assign_current_slave(bond, NULL); bond->primary_slave = NULL; while ((our_slave = bond->prev) != (slave_t *)bond) { @@ -1804,6 +1879,10 @@ static int bond_release_all(struct net_d slave_dev = our_slave->dev; bond_detach_slave(bond, our_slave); + if (bond_mode == BOND_MODE_TLB) { + bond_alb_deinit_slave(bond, our_slave); + } + if (multicast_mode == BOND_MULTICAST_ALL || (multicast_mode == BOND_MULTICAST_ACTIVE && bond->current_slave == our_slave)) { @@ -1944,6 +2023,10 @@ static void bond_mii_monitor(struct net_ bond_3ad_link_status_changed(slave, 0); } + if (bond_mode == BOND_MODE_TLB) { + bond_alb_handle_link_change(bond, slave, BOND_LINK_FAIL); + } + read_lock(&bond->ptrlock); if (slave == bond->current_slave) { read_unlock(&bond->ptrlock); @@ -2037,7 +2120,11 @@ static void bond_mii_monitor(struct net_ if (bond_mode == BOND_MODE_8023AD) { bond_3ad_link_status_changed(slave, 1); } - + + if (bond_mode == BOND_MODE_TLB) { + bond_alb_handle_link_change(bond, slave, BOND_LINK_UP); + } + if ( (bond->primary_slave != NULL) && (slave == bond->primary_slave) ) change_active_interface(bond); @@ -2104,6 +2191,10 @@ static void bond_mii_monitor(struct net_ if (bond_mode == BOND_MODE_8023AD) { bond_3ad_link_status_changed(bestslave, 1); } + + if (bond_mode == BOND_MODE_TLB) { + bond_alb_handle_link_change(bond, bestslave, BOND_LINK_UP); + } } if (bond_mode == BOND_MODE_ACTIVEBACKUP) { @@ -2113,7 +2204,7 
@@ static void bond_mii_monitor(struct net_ bestslave->state = BOND_STATE_ACTIVE; } write_lock(&bond->ptrlock); - bond->current_slave = bestslave; + bond_assign_current_slave(bond, bestslave); write_unlock(&bond->ptrlock); } else if (slave_died) { /* print this message only once a slave has just died */ @@ -2315,7 +2406,7 @@ static void activebackup_arp_monitor(str if ((bond->current_slave == NULL) && ((jiffies - slave->dev->trans_start) <= the_delta_in_ticks)) { - bond->current_slave = slave; + bond_assign_current_slave(bond, slave); bond_set_slave_active_flags(slave); bond_mc_update(bond, slave, NULL); bond->current_arp_slave = NULL; @@ -2427,7 +2518,7 @@ static void activebackup_arp_monitor(str bond_set_slave_inactive_flags(slave); bond_mc_update(bond, bond->primary_slave, slave); write_lock(&bond->ptrlock); - bond->current_slave = bond->primary_slave; + bond_assign_current_slave(bond, bond->primary_slave); write_unlock(&bond->ptrlock); slave = bond->primary_slave; bond_set_slave_active_flags(slave); @@ -2849,7 +2940,7 @@ static int bond_xmit_roundrobin(struct s dev_queue_xmit(skb); write_lock(&bond->ptrlock); - bond->current_slave = slave->next; + bond_assign_current_slave(bond, slave->next); write_unlock(&bond->ptrlock); read_unlock_irqrestore(&bond->lock, flags); @@ -3236,7 +3327,7 @@ static int __init bond_init(struct net_d memset(bond->stats, 0, sizeof(struct net_device_stats)); bond->next = bond->prev = (slave_t *)bond; - bond->current_slave = NULL; + bond_assign_current_slave(bond, NULL); bond->current_arp_slave = NULL; bond->device = dev; dev->priv = bond; @@ -3258,6 +3349,9 @@ static int __init bond_init(struct net_d case BOND_MODE_8023AD: dev->hard_start_xmit = bond_3ad_xmit_xor; break; + case BOND_MODE_TLB: + dev->hard_start_xmit = bond_alb_xmit; + break; default: printk(KERN_ERR "Unknown bonding mode %d\n", bond_mode); kfree(bond->stats); @@ -3450,8 +3544,7 @@ static int __init bonding_init(void) if (arp_interval != 0) { printk(KERN_WARNING 
"bonding_init(): ARP monitoring" "can't be used simultaneously with 802.3ad, " - "disabling ARP monitoring\n" - ); + "disabling ARP monitoring\n"); arp_interval = 0; } @@ -3460,20 +3553,41 @@ static int __init bonding_init(void) "bonding_init(): miimon must be specified, " "otherwise bonding will not detect link failure, " "speed and duplex which are essential " - "for 802.3ad operation" - "Forcing miimon to 100msec\n"); + "for 802.3ad operation\n"); + printk(KERN_ERR "Forcing miimon to 100msec\n"); miimon = 100; } if (multicast_mode != BOND_MULTICAST_ALL) { printk(KERN_ERR "bonding_init(): Multicast mode must " - "be set to ALL for 802.3ad, " - "Forcing Multicast mode to ALL\n"); + "be set to ALL for 802.3ad\n"); + printk(KERN_ERR "Forcing Multicast mode to ALL\n"); multicast_mode = BOND_MULTICAST_ALL; } } - + + /* reset values for TLB */ + if (bond_mode == BOND_MODE_TLB) { + if (miimon == 0) { + printk(KERN_ERR + "bonding_init(): miimon must be specified, " + "otherwise bonding will not detect link failure " + "and link speed which are essential " + "for TLB load balancing\n"); + printk(KERN_ERR "Forcing miimon to 100msec\n"); + miimon = 100; + } + + if (multicast_mode != BOND_MULTICAST_ACTIVE) { + printk(KERN_ERR + "bonding_init(): Multicast mode must " + "be set to ACTIVE for TLB\n"); + printk(KERN_ERR "Forcing Multicast mode to ACTIVE\n"); + multicast_mode = BOND_MULTICAST_ACTIVE; + } + } + if (miimon == 0) { if ((updelay != 0) || (downdelay != 0)) { /* just warn the user the up/down delay will have diff -Nuarp linux-2.4.20-bonding-20030317/drivers/net/bonding/Makefile linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/Makefile --- linux-2.4.20-bonding-20030317/drivers/net/bonding/Makefile 2003-03-27 17:00:05.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/drivers/net/bonding/Makefile 2003-03-27 17:00:06.000000000 +0200 @@ -5,7 +5,8 @@ O_TARGET := bonding.o obj-y := bond_main.o \ - bond_3ad.o + bond_3ad.o \ + bond_alb.o obj-m := $(O_TARGET) diff 
-Nuarp linux-2.4.20-bonding-20030317/include/linux/if_bonding.h linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h --- linux-2.4.20-bonding-20030317/include/linux/if_bonding.h 2003-03-27 17:00:05.000000000 +0200 +++ linux-2.4.20-bonding-20030317-devel/include/linux/if_bonding.h 2003-03-27 17:00:06.000000000 +0200 @@ -55,6 +55,7 @@ #define BOND_MODE_XOR 2 #define BOND_MODE_BROADCAST 3 #define BOND_MODE_8023AD 4 +#define BOND_MODE_TLB 5 /* each slave's link has 4 states */ #define BOND_LINK_UP 0 /* link is up and running */ -- | Shmulik Hen | | Israel Design Center (Jerusalem) | | LAN Access Division | | Intel Communications Group, Intel corp. | From davem@redhat.com Thu Mar 27 08:10:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 08:11:18 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RGA4q9026968 for ; Thu, 27 Mar 2003 08:10:45 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id IAA03974; Thu, 27 Mar 2003 08:06:28 -0800 Date: Thu, 27 Mar 2003 08:06:27 -0800 (PST) Message-Id: <20030327.080627.71980411.davem@redhat.com> To: shmulik.hen@intel.com Cc: bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, linux-net@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, jgarzik@pobox.com Subject: Re: [Bonding][patch] Adding Transmit load balancing mode to bonding From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2065 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 300 Lines: 8 From: shmulik.hen@intel.com Date: Thu, 27 Mar 2003 17:38:02 +0200 (IST) Balancing is connection oriented (e.g. by IPv4 destination address) so packet order is always kept. You could also key off of the destination/source port as well for UDP/TCP/SCTP. Have you experimented with this? From Robert.Olsson@data.slu.se Thu Mar 27 08:56:07 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 08:56:12 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RGtPq9027879 for ; Thu, 27 Mar 2003 08:56:06 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id RAA28881; Thu, 27 Mar 2003 17:54:18 +0100 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16003.11449.497905.815776@robur.slu.se> Date: Thu, 27 Mar 2003 17:54:17 +0100 To: "Feldman, Scott" Cc: Robert Olsson , Jeff Garzik , netdev@oss.sgi.com Subject: RE: [Fwd: [E1000] NAPI re-insertion w/ changes] X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2066 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 3143 Lines: 98 Feldman, Scott writes: > Easy enough to revert back. 
I don't think we've lost any of the
> non-perf benefits of NAPI, and if testing shows no meaningful perf
> difference, let's let Occam's razor rule.

Hello!

Here is a hi-perf routing/DoS to start with...

10 Million pkts injected at high speed into eth2 and forwarded to eth3. Rx and
Tx buffers are 256 and HW_FLOW is disabled and RxIntDelay=1. Which is same
parameters as we use for production systems. As seen link now flaps.
Eventually can hw_flowcontrol and interrupt delays help this... but thats not
an option at least not for us.

Twist:        New                      Old
====================================
Input rate:   680 (due to link drop)   820 kpps
T-put:        309                      385 kpps
RX irq's:     78963                    434

Cheers.
						--ro

e1000 2.5.66 NAPI (680 kpps in)
-------------------------------
NETDEV WATCHDOG: eth2: transmit timed out
e1000: eth2 NIC Link is Up 1000 Mbps Full Duplex

Iface   MTU Met    RX-OK  RX-ERR  RX-DRP  RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flags
eth2   1500   0  4175392 7277442 7277442 4696985       18      0      0      0 BRU
eth3   1500   0        1       0       0       0  4555009      0      0      0 BRU

 26:      78963   IO-APIC-level  eth2
 27:      92979   IO-APIC-level  eth3

e1000 2.5.66 NAPI with patch (820 kpps in)
------------------------------------------
Iface   MTU Met    RX-OK  RX-ERR  RX-DRP  RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flags
eth2   1500   0  4708196 8173469 8173469 5291805       18      0      0      0 BRU
eth3   1500   0        1       0       0       0  4708135      0      0      0 BRU

 26:        434   IO-APIC-level  eth2
 27:      74584   IO-APIC-level  eth3

--- e1000_main.c.orig	2003-03-27 14:38:02.000000000 +0100
+++ e1000_main.c	2003-03-27 16:43:38.000000000 +0100
@@ -2000,9 +2000,12 @@
 	}
 
 #ifdef CONFIG_E1000_NAPI
-	/* Don't disable interrupts - rely on h/w interrupt
-	 * moderation to keep interrupts low.  netif_rx_schedule
-	 * is NOP if already polling. */
+	/* Disable interrupts and register for poll. The flush
+	   of the posted write is intentionally left out.
+	*/
+
+	atomic_inc(&adapter->irq_sem);
+	E1000_WRITE_REG(&adapter->hw, IMC, ~0);
 	netif_rx_schedule(netdev);
 #else
 	for(i = 0; i < E1000_MAX_INTR; i++)
@@ -2024,18 +2027,17 @@
 	struct e1000_adapter *adapter = netdev->priv;
 	int work_to_do = min(*budget, netdev->quota);
 	int work_done = 0;
-
-	while(work_done < work_to_do)
-		if(!e1000_clean_rx_irq(adapter, &work_done, work_to_do) &&
-		   !e1000_clean_tx_irq(adapter))
-			break;
+
+	e1000_clean_tx_irq(adapter);
+	e1000_clean_rx_irq(adapter, &work_done, work_to_do);
 
 	*budget -= work_done;
 	netdev->quota -= work_done;
 
-	if(work_done < work_to_do)
+	if(work_done < work_to_do) {
 		netif_rx_complete(netdev);
-
+		e1000_irq_enable(adapter);
+	}
 	return (work_done >= work_to_do);
 }
 #endif

From torvalds@transmeta.com Thu Mar 27 09:24:37 2003
Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 09:24:42 -0800 (PST)
Received: from neon-gw.transmeta.com (neon-gw-l3.transmeta.com [63.209.4.196]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RHOaq9028475 for ; Thu, 27 Mar 2003 09:24:37 -0800
Received: (from root@localhost) by neon-gw.transmeta.com (8.9.3/8.9.3) id JAA13760; Thu, 27 Mar 2003 09:23:42 -0800
Received: from mailhost.transmeta.com(10.1.1.15) by neon-gw.transmeta.com via smap (V2.1) id xma013747; Thu, 27 Mar 03 09:23:17 -0800
Received: from home.transmeta.com (torvalds-home.transmeta.com [10.64.7.194]) by deepthought.transmeta.com (8.11.6/8.11.6) with ESMTP id h2RHNIa12184; Thu, 27 Mar 2003 09:23:19 -0800 (PST)
Date: Thu, 27 Mar 2003 09:22:29 -0800 (PST)
From: Linus Torvalds
To: "David S. Miller"
cc: shmulik.hen@intel.com, , , , , , , ,
Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled.
In-Reply-To: <20030327.054357.17283294.davem@redhat.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2067
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: torvalds@transmeta.com
Precedence: bulk
X-list: netdev
Content-Length: 1450
Lines: 56

On Thu, 27 Mar 2003, David S. Miller wrote:
>
>    Further more, holding a lock_irq doesn't mean bottom halves are disabled
>    too, it just means interrupts are disabled and no *new* softirq can be
>    queued. Consider the following situation:
>
> I think local_bh_enable() should check irqs_disabled() and honour that.
> What you are showing here, that BH's can run via local_bh_enable()
> even when IRQs are disabled, is a BUG().

I'd disagree.

I do agree that we should obviously not run bottom halves with interrupts
disabled, but I think the _real_ bug is doing "local_bh_enable()" in the
first place. It's a nesting bug: you must nest the "stronger" lock inside
the weaker one, which means that the following is right:

	local_bh_disable()
	..
	local_irq_disable()
	...
	local_irq_enable()
	..
	local_bh_enable()

and this is WRONG:

	local_irq_disable() (or spinlock)
	..
	local_bh_disable()
	..
	local_bh_enable()	!BUG BUG BUG!
	..
	local_irq_enable()

So the bug is, in my opinion, not in BH handling, but in the caller.

I missed the start of this thread, so I don't know how hard this is to
fix. But if you have a buggy sequence, the _simple_ fix may be to do
something like this:

 +++	local_bh_disable()
	local_irq_disable() (or spinlock)
	..
	local_bh_disable()
	..
	local_bh_enable()	! now it's a no-op and no longer a bug
	..
	local_irq_enable()
 +++	local_bh_enable()

What's the code sequence?
Linus
From davem@redhat.com Thu Mar 27 09:59:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 09:59:45 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RHxYq9030207 for ; Thu, 27 Mar 2003 09:59:40 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id JAA04319; Thu, 27 Mar 2003 09:55:38 -0800 Date: Thu, 27 Mar 2003 09:55:37 -0800 (PST) Message-Id: <20030327.095537.26269606.davem@redhat.com> To: torvalds@transmeta.com Cc: shmulik.hen@intel.com, dane@aiinet.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: References: <20030327.054357.17283294.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2069 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 314 Lines: 14 From: Linus Torvalds Date: Thu, 27 Mar 2003 09:22:29 -0800 (PST) I do agree that we should obviously not run bottom halves with interrupts disabled Ok, so can we add a: if (irqs_disabled()) BUG(); check to do_softirq()? I'll address the rest of your email in a bit.
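
The nesting rule and the proposed check can be illustrated with a small userspace model. This is only a sketch, not kernel code: the sim_* functions, the counters and the run_* helpers are hypothetical stand-ins for local_bh_disable()/local_bh_enable(), local_irq_disable()/local_irq_enable() and do_softirq(), and the model counts a violation at the point where the proposal would call BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: these stand in for per-CPU kernel state. */
static int  bh_disable_depth;  /* nesting depth of sim_local_bh_disable() */
static bool irqs_off;          /* true while "interrupts" are disabled */
static bool softirq_pending;   /* a bottom half is waiting to run */
static int  softirqs_run;      /* bottom halves executed so far */
static int  violations;        /* softirq runs attempted with IRQs off */

static void sim_do_softirq(void)
{
	/* The proposed check. The kernel would BUG(); the model counts. */
	if (irqs_off)
		violations++;
	if (softirq_pending) {
		softirq_pending = false;
		softirqs_run++;
	}
}

static void sim_local_irq_disable(void) { irqs_off = true; }
static void sim_local_irq_enable(void)  { irqs_off = false; }
static void sim_local_bh_disable(void)  { bh_disable_depth++; }

static void sim_local_bh_enable(void)
{
	assert(bh_disable_depth > 0);
	/* Only the outermost enable runs pending bottom halves. */
	if (--bh_disable_depth == 0 && softirq_pending)
		sim_do_softirq();
}

static void sim_reset(void)
{
	bh_disable_depth = 0;
	irqs_off = false;
	softirq_pending = false;
	softirqs_run = 0;
	violations = 0;
}

/* bh outside, irq inside: the ordering described as right. */
int run_good_nesting(void)
{
	sim_reset();
	sim_local_bh_disable();
	sim_local_irq_disable();
	softirq_pending = true;   /* raised while interrupts were off */
	sim_local_irq_enable();
	sim_local_bh_enable();    /* softirq runs with interrupts on */
	return violations;
}

/* irq outside, bh inside: the ordering described as WRONG. */
int run_bad_nesting(void)
{
	sim_reset();
	sim_local_irq_disable();
	sim_local_bh_disable();
	softirq_pending = true;
	sim_local_bh_enable();    /* softirq runs with interrupts off */
	sim_local_irq_enable();
	return violations;
}

/* The "simple fix": wrap the bad sequence in an outer bh disable. */
int run_wrapped_nesting(void)
{
	sim_reset();
	sim_local_bh_disable();
	sim_local_irq_disable();
	sim_local_bh_disable();
	softirq_pending = true;
	sim_local_bh_enable();    /* inner enable is now a no-op */
	sim_local_irq_enable();
	sim_local_bh_enable();    /* softirq runs here, interrupts on */
	return violations;
}
```

In this model run_good_nesting() and run_wrapped_nesting() report no violation and run_bad_nesting() reports one, which is the case a check in do_softirq() would be expected to flag.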
From torvalds@transmeta.com Thu Mar 27 10:07:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 10:07:27 -0800 (PST) Received: from neon-gw.transmeta.com (neon-gw-l3.transmeta.com [63.209.4.196]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RI6gq9030656 for ; Thu, 27 Mar 2003 10:07:22 -0800 Received: (from root@localhost) by neon-gw.transmeta.com (8.9.3/8.9.3) id KAA16454; Thu, 27 Mar 2003 10:06:06 -0800 Received: from mailhost.transmeta.com(10.1.1.15) by neon-gw.transmeta.com via smap (V2.1) id xma016440; Thu, 27 Mar 03 10:05:39 -0800 Received: from home.transmeta.com (torvalds-home.transmeta.com [10.64.7.194]) by deepthought.transmeta.com (8.11.6/8.11.6) with ESMTP id h2RI5ga15986; Thu, 27 Mar 2003 10:05:42 -0800 (PST) Date: Thu, 27 Mar 2003 10:04:52 -0800 (PST) From: Linus Torvalds To: "David S. Miller" cc: shmulik.hen@intel.com, , , , , , , , Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. In-Reply-To: <20030327.095537.26269606.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2070 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: torvalds@transmeta.com Precedence: bulk X-list: netdev Content-Length: 530 Lines: 21 On Thu, 27 Mar 2003, David S. Miller wrote: > > I do agree that we should obviously not run bottom halves with > interrupts disabled > > Ok, so can we add a: > > if (irqs_disabled()) > BUG(); > > check to do_softirq()? I'd suggest making it a counting warning (with a static counter per local-bh-enable macro expansion) and adding it to local_bh_enable() - otherwise it will only BUG() when the (potentially rare) condition happens - instead of always giving a nice backtrace of exact problem spots. 
Linus From davem@redhat.com Thu Mar 27 10:12:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 10:12:21 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RIBcq9030995 for ; Thu, 27 Mar 2003 10:12:19 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id KAA04389; Thu, 27 Mar 2003 10:07:01 -0800 Date: Thu, 27 Mar 2003 10:07:00 -0800 (PST) Message-Id: <20030327.100700.23575723.davem@redhat.com> To: torvalds@transmeta.com Cc: shmulik.hen@intel.com, dane@aiinet.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: References: <20030327.095537.26269606.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2071 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 515 Lines: 11 From: Linus Torvalds Date: Thu, 27 Mar 2003 10:04:52 -0800 (PST) I'd suggest making it a counting warning (with a static counter per local-bh-enable macro expansion) and adding it to local_bh_enable() - otherwise it will only BUG() when the (potentially rare) condition happens - instead of always giving a nice backtrace of exact problem spots. Ok, maybe it's time to move local_bh_enable() out of line, it's getting large and it's expanded in hundreds of places. 
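
The counting warning suggested above could look roughly like the following userspace sketch. COUNTING_WARN_ON, WARN_LIMIT, irqs_off_model, total_warnings and the site_* functions are made-up names for illustration, and a file and line report stands in for the kernel backtrace.

```c
#include <assert.h>
#include <stdio.h>

#define WARN_LIMIT 5        /* hypothetical cap on reports per call site */

static int total_warnings;  /* model-only tally so the effect can be checked */
int irqs_off_model;         /* stands in for irqs_disabled() */

/*
 * Each expansion of the macro declares its own static counter, so every
 * call site is limited independently and names its own file and line.
 */
#define COUNTING_WARN_ON(cond)                                        \
	do {                                                          \
		static int site_count;                                \
		if ((cond) && site_count < WARN_LIMIT) {              \
			site_count++;                                 \
			total_warnings++;                             \
			fprintf(stderr, "warning: %s at %s:%d\n",     \
				#cond, __FILE__, __LINE__);           \
		}                                                     \
	} while (0)

/* Two separate expansions, hence two separate counters. */
void site_a(void) { COUNTING_WARN_ON(irqs_off_model); }
void site_b(void) { COUNTING_WARN_ON(irqs_off_model); }

/* Hit both sites repeatedly with the condition true; return the tally. */
int hammer_sites(int calls)
{
	assert(calls >= 0);
	irqs_off_model = 1;
	for (int i = 0; i < calls; i++) {
		site_a();
		site_b();
	}
	irqs_off_model = 0;
	return total_warnings;
}
```

Because the counter lives in the expansion rather than in a shared function, a rarely hit call site still gets its own reports even when a hot call site has already used up its limit. Moving the body out of line, as proposed, would need the call site identity passed in some other way.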
From trondmy@charged.uio.no Thu Mar 27 10:24:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 10:24:19 -0800 (PST) Received: from mons.uio.no (IDENT:7411@mons.uio.no [129.240.130.14]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RINYq9032014 for ; Thu, 27 Mar 2003 10:24:14 -0800 Received: from charged.uio.no ([129.240.86.49]) by mons.uio.no with esmtp (Exim 2.12 #7) id 18yY6o-0002GS-00; Thu, 27 Mar 2003 15:11:58 +0100 Received: from trondmy by charged.uio.no with local (Exim 2.12 #1) id 18yY6m-0003fm-00; Thu, 27 Mar 2003 15:11:56 +0100 To: "David S. Miller" Cc: shmulik.hen@intel.com, dane@aiinet.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, torvalds@transmeta.com, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. References: <20030327.054357.17283294.davem@redhat.com> From: Trond Myklebust Date: 27 Mar 2003 15:11:56 +0100 In-Reply-To: <20030327.054357.17283294.davem@redhat.com> Message-ID: User-Agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Honest Recruiter) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2072 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: trond.myklebust@fys.uio.no Precedence: bulk X-list: netdev Content-Length: 826 Lines: 22 >>>>> " " == David S Miller writes: > From: shmulik.hen@intel.com Date: Thu, 27 Mar 2003 15:32:02 > +0200 (IST) > Further more, holding a lock_irq doesn't mean bottom halves > are disabled too, it just means interrupts are disabled and > no *new* softirq can be queued. Consider the following > situation: > I think local_bh_enable() should check irqs_disabled() and > honour that. 
What you are showing here, that BH's can run via > local_bh_enable() even when IRQs are disabled, is a BUG(). > IRQ disabling is meant to be stronger than softint disabling. In that case, you'll need to have things like spin_lock_irqrestore() call local_bh_enable() in order to run the pending softirqs. Is that worth the trouble? Cheers, Trond From linux-netdev@gmane.org Thu Mar 27 10:29:33 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 10:29:36 -0800 (PST) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RITVq9032686 for ; Thu, 27 Mar 2003 10:29:32 -0800 Received: from list by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 18yau3-0003RT-00 for ; Thu, 27 Mar 2003 18:10:59 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 18yat9-0003PA-00 for ; Thu, 27 Mar 2003 18:10:03 +0100 From: Jason Lunz Subject: Re: [Fwd: [E1000] NAPI re-insertion w/ changes] Date: Thu, 27 Mar 2003 17:10:03 +0000 (UTC) Organization: PBR Streetgang Message-ID: References: <16003.11449.497905.815776@robur.slu.se> X-Complaints-To: usenet@main.gmane.org User-Agent: slrn/0.9.7.4 (Linux) X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2073 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lunz@falooley.org Precedence: bulk X-list: netdev Content-Length: 1032 Lines: 26 Robert.Olsson@data.slu.se said: > 10 Million pkts injected at high speed into eth2 and forwarded to eth3. Rx and > Tx buffers are 256 and HW_FLOW is disabled and RxIntDelay=1. Which is same > parameters as we use for production systems. As seen link now flaps. > Eventually can hw_flowcontrol and interrupt delays help this... but thats not > an option at least not for us. 
> > > Twist: New Old > ==================================== > Input rate: 680 (due to link drop) 820 kpps > T-put: 309 385 kpps > RX irq's: 78963 434 I've seen pretty much the same thing. I plotted throughput vs. offered load for e1000 4.4.12-k1, 4.4.19-k3, and 5.0.43-k1 (all backported to 2.4.20). A summary with graphs is at: http://gtf.org/lunz/linux/net/perf/ 5.0.43 seems to be a significant regression, both in terms of throughput and CPU load. -- Jason Lunz Reflex Security lunz@reflexsecurity.com http://www.reflexsecurity.com/ From dane@aiinet.com Thu Mar 27 11:02:29 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 11:02:37 -0800 (PST) Received: from aimail.aiinet.com ([205.245.180.30]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RJ2Rq9001252 for ; Thu, 27 Mar 2003 11:02:29 -0800 Received: from dane-linux.aiinet.com ([10.39.3.117]) by aimail.aiinet.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id FXZAHRB3; Thu, 27 Mar 2003 14:02:16 -0500 Date: Thu, 27 Mar 2003 14:02:20 -0500 (EST) From: Dan Eble Reply-To: To: Linus Torvalds cc: "David S. Miller" , , , , , , , , Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. In-Reply-To: <3B785392832ED71192AE00D0B7B0D75B539668@aimail.aiinet.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2074 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dane@aiinet.com Precedence: bulk X-list: netdev Content-Length: 1542 Lines: 42 On Thu, 27 Mar 2003, Linus Torvalds wrote: > > On Thu, 27 Mar 2003, David S. Miller wrote: > > > > Ok, so can we add a: > > > > if (irqs_disabled()) > > BUG(); > > > > check to do_softirq()? 
> > I'd suggest making it a counting warning (with a static counter per > local-bh-enable macro expansion) and adding it to local_bh_enable() - > otherwise it will only BUG() when the (potentially rare) condition > happens - instead of always giving a nice backtrace of exact problem > spots. So, to return to my original question... local_bh_count() > 0 when a BH is running or after local_bh_disable(). local_irq_count() > 0 in interrupt context, but not necessarily when interrupts are disabled. This makes checks like the following (in alloc_skb) asymmetric: if (in_interrupt() && (gfp_mask & __GFP_WAIT)) { static int count = 0; if (++count < 5) { printk(KERN_ERR "alloc_skb called nonatomically " "from interrupt %p\n", NET_CALLER(size)); BUG(); In a driver I'm writing, this bug was hidden until I switched from using write_lock_irqsave() to write_lock_bh(). Shouldn't this bug also be announced if interrupts are disabled? (I understand that disabling bh/irq in the correct order will ensure that this bug is properly detected, but it seems like a strange policy to rely on correct coding to catch a bug.) -- Dan Eble _____ . | _ |/| Applied Innovation Inc. 
| |_| | | http://www.aiinet.com/ |__/|_|_| From torvalds@transmeta.com Thu Mar 27 11:10:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 11:11:17 -0800 (PST) Received: from neon-gw.transmeta.com (neon-gw-l3.transmeta.com [63.209.4.196]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RJA4q9001653 for ; Thu, 27 Mar 2003 11:10:44 -0800 Received: (from root@localhost) by neon-gw.transmeta.com (8.9.3/8.9.3) id LAA20216; Thu, 27 Mar 2003 11:09:47 -0800 Received: from mailhost.transmeta.com(10.1.1.15) by neon-gw.transmeta.com via smap (V2.1) id xma020180; Thu, 27 Mar 03 11:09:13 -0800 Received: from home.transmeta.com (torvalds-home.transmeta.com [10.64.7.194]) by deepthought.transmeta.com (8.11.6/8.11.6) with ESMTP id h2RJ9Ga21046; Thu, 27 Mar 2003 11:09:16 -0800 (PST) Date: Thu, 27 Mar 2003 11:08:26 -0800 (PST) From: Linus Torvalds To: Dan Eble cc: "David S. Miller" , , , , , , , , Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2075 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: torvalds@transmeta.com Precedence: bulk X-list: netdev Content-Length: 936 Lines: 33 On Thu, 27 Mar 2003, Dan Eble wrote: > > This makes checks like the following (in alloc_skb) asymmetric: > > if (in_interrupt() && (gfp_mask & __GFP_WAIT)) { > static int count = 0; > if (++count < 5) { > printk(KERN_ERR "alloc_skb called nonatomically " > "from interrupt %p\n", NET_CALLER(size)); > BUG(); > > In a driver I'm writing, this bug was hidden until I switched from using > write_lock_irqsave() to write_lock_bh(). Shouldn't this bug also be > announced if interrupts are disabled? Yeah. 
It should also probably use "in_atomic()" instead of "in_interrupt()", since that also finds people who have marked themselves non-preemptible. So what the test SHOULD look like is this: if (gfp_mask & __GFP_WAIT) { if (in_atomic() || irqs_disabled()) { static int count = 0; ... } } which should catch all the cases we really care about. Linus From davem@redhat.com Thu Mar 27 11:14:39 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 11:14:41 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RJDwq9001988 for ; Thu, 27 Mar 2003 11:14:39 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA04535; Thu, 27 Mar 2003 11:10:12 -0800 Date: Thu, 27 Mar 2003 11:10:12 -0800 (PST) Message-Id: <20030327.111012.23672715.davem@redhat.com> To: torvalds@transmeta.com Cc: dane@aiinet.com, shmulik.hen@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: References: X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2076 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 454 Lines: 16 From: Linus Torvalds Date: Thu, 27 Mar 2003 11:08:26 -0800 (PST) So what the test SHOULD look like is this: if (gfp_mask & __GFP_WAIT) { if (in_atomic() || irqs_disabled()) { static int count = 0; ... 
} } which should catch all the cases we really care about. Let's codify this "in_atomic() || irqs_disabled()" test into a macro that everyone can use to test sleepability, ok? From torvalds@transmeta.com Thu Mar 27 11:25:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 11:25:17 -0800 (PST) Received: from neon-gw.transmeta.com (neon-gw-l3.transmeta.com [63.209.4.196]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RJOXq9002731 for ; Thu, 27 Mar 2003 11:25:14 -0800 Received: (from root@localhost) by neon-gw.transmeta.com (8.9.3/8.9.3) id LAA21181; Thu, 27 Mar 2003 11:24:16 -0800 Received: from mailhost.transmeta.com(10.1.1.15) by neon-gw.transmeta.com via smap (V2.1) id xma021137; Thu, 27 Mar 03 11:23:42 -0800 Received: from home.transmeta.com (torvalds-home.transmeta.com [10.64.7.194]) by deepthought.transmeta.com (8.11.6/8.11.6) with ESMTP id h2RJNja22184; Thu, 27 Mar 2003 11:23:45 -0800 (PST) Date: Thu, 27 Mar 2003 11:22:55 -0800 (PST) From: Linus Torvalds To: "David S. Miller" cc: dane@aiinet.com, , , , , , , , Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. In-Reply-To: <20030327.111012.23672715.davem@redhat.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2077 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: torvalds@transmeta.com Precedence: bulk X-list: netdev Content-Length: 704 Lines: 23 On Thu, 27 Mar 2003, David S. Miller wrote: > > Let's codify this "in_atomic() || irqs_disabled()" test into a macro > that everyone can use to test sleepability, ok? Well, I really don't want people to act dynamically differently depending on whether they can sleep or not. That makes static sanity-testing impossible. So I really think that the only really valid use of the above is on one single place: might_sleep(). 
Which right now doesn't do the "irqs_disabled()" test, but otherwise looks good. So the code should really just say if (gfp_mask & __GFP_WAIT) might_sleep(); and might_sleep() should be updated. Anybody want to try that and see whether things break horribly? Linus From davem@redhat.com Thu Mar 27 11:43:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 11:43:27 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RJgfq9005227 for ; Thu, 27 Mar 2003 11:43:22 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA04627; Thu, 27 Mar 2003 11:39:34 -0800 Date: Thu, 27 Mar 2003 11:39:33 -0800 (PST) Message-Id: <20030327.113933.123322481.davem@redhat.com> To: torvalds@transmeta.com Cc: dane@aiinet.com, shmulik.hen@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: References: <20030327.111012.23672715.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2078 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 410 Lines: 14 From: Linus Torvalds Date: Thu, 27 Mar 2003 11:22:55 -0800 (PST) if (gfp_mask & __GFP_WAIT) might_sleep(); and might_sleep() should be updated. Anybody want to try that and see whether things break horribly? I hadn't considered this, good idea. 
I'm trying this out right now. Someone should backport the might_sleep() stuff to 2.4.x, it's very useful. From davem@redhat.com Thu Mar 27 11:57:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 11:57:06 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RJuMq9014098 for ; Thu, 27 Mar 2003 11:57:02 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA04696; Thu, 27 Mar 2003 11:53:15 -0800 Date: Thu, 27 Mar 2003 11:53:14 -0800 (PST) Message-Id: <20030327.115314.121599027.davem@redhat.com> To: rml@tech9.net Cc: torvalds@transmeta.com, dane@aiinet.com, shmulik.hen@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: <1048794730.775.14.camel@localhost> References: <20030327.113933.123322481.davem@redhat.com> <1048794730.775.14.camel@localhost> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2079 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 491 Lines: 15 From: Robert Love Date: 27 Mar 2003 14:52:11 -0500 On Thu, 2003-03-27 at 14:39, David S. Miller wrote: > I hadn't considered this, good idea. I'm trying this out right now. I hope it works. I have a sinking feeling we call it some places that may have interrupts disabled... Your sinking feeling was warranted. 
Nearly every hw IRQ implementation invokes irq_exit() with CPU interrupts off :-( That has to be screwing with performance as well. From davem@redhat.com Thu Mar 27 12:59:15 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 12:59:18 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RKwXq9020455 for ; Thu, 27 Mar 2003 12:59:14 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id MAA07577; Thu, 27 Mar 2003 12:55:08 -0800 Date: Thu, 27 Mar 2003 12:55:07 -0800 (PST) Message-Id: <20030327.125507.104718048.davem@redhat.com> To: torvalds@transmeta.com Cc: dane@aiinet.com, shmulik.hen@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: <20030327.113933.123322481.davem@redhat.com> References: <20030327.111012.23672715.davem@redhat.com> <20030327.113933.123322481.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2080 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 4792 Lines: 168 From: "David S. Miller" Date: Thu, 27 Mar 2003 11:39:33 -0800 (PST) From: Linus Torvalds Date: Thu, 27 Mar 2003 11:22:55 -0800 (PST) if (gfp_mask & __GFP_WAIT) might_sleep(); and might_sleep() should be updated. Anybody want to try that and see whether things break horribly? 
I hadn't considered this, good idea. I'm trying this out right now. Ok, I'm running this now and it appears to work. i386 will need similar changes to its irq_exit() call sites. One might_sleep still triggers, the cpufreq_register_notifier() call during boot. It takes a rwsem. This will trigger on i386 too with TSC and CPUFREQ both enabled. Oh yeah, the fbcon cursor thing triggers too, but that's been discussed to death in another thread and hopefully a fix will be pushed upstream soon by the fbcon guys. --- ./arch/sparc64/kernel/smp.c.~1~ Thu Mar 27 11:57:41 2003 +++ ./arch/sparc64/kernel/smp.c Thu Mar 27 12:05:46 2003 @@ -1055,11 +1055,10 @@ void smp_percpu_timer_interrupt(struct p clear_softint(tick_mask); } + irq_enter(); do { sparc64_do_profile(regs); if (!--prof_counter(cpu)) { - irq_enter(); - if (cpu == boot_cpu_id) { kstat_cpu(cpu).irqs[0]++; timer_tick_interrupt(regs); @@ -1067,7 +1066,6 @@ void smp_percpu_timer_interrupt(struct p update_process_times(user); - irq_exit(); prof_counter(cpu) = prof_multiplier(cpu); } @@ -1088,6 +1086,9 @@ void smp_percpu_timer_interrupt(struct p : /* no outputs */ : "r" (pstate)); } while (time_after_eq(tick, compare)); + + local_irq_enable(); + irq_exit(); } static void __init smp_setup_percpu_timer(void) --- ./arch/sparc64/kernel/irq.c.~1~ Thu Mar 27 11:57:41 2003 +++ ./arch/sparc64/kernel/irq.c Thu Mar 27 12:42:13 2003 @@ -356,7 +356,7 @@ int request_irq(unsigned int irq, void ( } if (action == NULL) action = (struct irqaction *)kmalloc(sizeof(struct irqaction), - GFP_KERNEL); + GFP_ATOMIC); if (!action) { spin_unlock_irqrestore(&irq_action_lock, flags); @@ -376,7 +376,7 @@ int request_irq(unsigned int irq, void ( goto free_and_ebusy; } if ((bucket->flags & IBF_MULTI) == 0) { - vector = kmalloc(sizeof(void *) * 4, GFP_KERNEL); + vector = kmalloc(sizeof(void *) * 4, GFP_ATOMIC); if (vector == NULL) goto free_and_enomem; @@ -793,6 +793,7 @@ void handler_irq(int irq, struct pt_regs bp->flags &= ~IBF_INPROGRESS; } + 
local_irq_enable(); irq_exit(); } @@ -900,7 +901,7 @@ int request_fast_irq(unsigned int irq, } if (action == NULL) action = (struct irqaction *)kmalloc(sizeof(struct irqaction), - GFP_KERNEL); + GFP_ATOMIC); if (!action) { spin_unlock_irqrestore(&irq_action_lock, flags); return -ENOMEM; --- ./arch/sparc64/kernel/traps.c.~1~ Thu Mar 27 12:13:23 2003 +++ ./arch/sparc64/kernel/traps.c Thu Mar 27 12:15:53 2003 @@ -1575,6 +1575,9 @@ void show_trace_raw(struct thread_info * struct reg_window *rw; int count = 0; + if (tp == current_thread_info()) + flushw_all(); + fp = ksp + STACK_BIAS; thread_base = (unsigned long) tp; do { @@ -1595,6 +1598,15 @@ void show_trace_task(struct task_struct if (tsk) show_trace_raw(tsk->thread_info, tsk->thread_info->ksp); +} + +void dump_stack(void) +{ + unsigned long ksp; + + __asm__ __volatile__("mov %%fp, %0" + : "=r" (ksp)); + show_trace_raw(current_thread_info(), ksp); } void die_if_kernel(char *str, struct pt_regs *regs) --- ./kernel/sched.c.~1~ Thu Mar 27 11:27:01 2003 +++ ./kernel/sched.c Thu Mar 27 11:27:41 2003 @@ -2554,7 +2554,7 @@ void __might_sleep(char *file, int line) #if defined(in_atomic) static unsigned long prev_jiffy; /* ratelimiting */ - if (in_atomic()) { + if (in_atomic() || irqs_disabled()) { if (time_before(jiffies, prev_jiffy + HZ)) return; prev_jiffy = jiffies; --- ./kernel/softirq.c.~1~ Thu Mar 27 11:28:20 2003 +++ ./kernel/softirq.c Thu Mar 27 11:52:35 2003 @@ -60,6 +60,9 @@ asmlinkage void do_softirq() if (in_interrupt()) return; + if (irqs_disabled()) + BUG(); + local_irq_save(flags); cpu = smp_processor_id(); --- ./net/core/skbuff.c.~1~ Thu Mar 27 11:28:53 2003 +++ ./net/core/skbuff.c Thu Mar 27 11:29:12 2003 @@ -170,15 +170,8 @@ struct sk_buff *alloc_skb(unsigned int s struct sk_buff *skb; u8 *data; - if (in_interrupt() && (gfp_mask & __GFP_WAIT)) { - static int count; - if (++count < 5) { - printk(KERN_ERR "alloc_skb called nonatomically " - "from interrupt %p\n", NET_CALLER(size)); - BUG(); - } - gfp_mask &= 
~__GFP_WAIT; - } + if (gfp_mask & __GFP_WAIT) + might_sleep(); /* Get the HEAD */ skb = skb_head_from_pool(); From davem@redhat.com Thu Mar 27 13:33:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 13:33:08 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RLX1q9024164 for ; Thu, 27 Mar 2003 13:33:02 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id NAA09676; Thu, 27 Mar 2003 13:29:55 -0800 Date: Thu, 27 Mar 2003 13:29:54 -0800 (PST) Message-Id: <20030327.132954.96279785.davem@redhat.com> To: torvalds@transmeta.com Cc: dane@aiinet.com, shmulik.hen@intel.com, bonding-devel@lists.sourceforge.net, bonding-announce@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, linux-net@vger.kernel.org, mingo@redhat.com, kuznet@ms2.inr.ac.ru Subject: Re: BUG or not? GFP_KERNEL with interrupts disabled. From: "David S. Miller" In-Reply-To: <20030327.125507.104718048.davem@redhat.com> References: <20030327.113933.123322481.davem@redhat.com> <20030327.125507.104718048.davem@redhat.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2081 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 542 Lines: 20 From: "David S. Miller" Date: Thu, 27 Mar 2003 12:55:07 -0800 (PST) Alexey has pointed out a bug in my changes. 
@@ -1088,6 +1086,9 @@ void smp_percpu_timer_interrupt(struct p : /* no outputs */ : "r" (pstate)); } while (time_after_eq(tick, compare)); + + local_irq_enable(); + irq_exit(); } static void __init smp_setup_percpu_timer(void) Of course this is bogus. The IRQ enable needs to occur in the irq_exit() branch right before do_softirq() is invoked. From jsd@monmouth.com Thu Mar 27 13:48:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 13:48:52 -0800 (PST) Received: from av8n.net (pcp03191463pcs.midltn01.nj.comcast.net [68.37.175.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RLmhq9024734 for ; Thu, 27 Mar 2003 13:48:45 -0800 Received: (qmail 809 invoked from network); 27 Mar 2003 21:48:38 -0000 Received: from localhost (HELO monmouth.com) (127.0.0.1) by localhost with SMTP; 27 Mar 2003 21:48:38 -0000 Message-ID: <3E8371B5.7030200@monmouth.com> Date: Thu, 27 Mar 2003 16:48:37 -0500 From: "John S. Denker" User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030323 X-Accept-Language: en-us, en MIME-Version: 1.0 To: bert hubert CC: netdev Subject: Re: ?completeness of IPsec feature-set References: <3E82DCF7.7090706@monmouth.com> <20030327133659.GA11820@outpost.ds9a.nl> In-Reply-To: <20030327133659.GA11820@outpost.ds9a.nl> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2082 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jsd@monmouth.com Precedence: bulk X-list: netdev Content-Length: 1799 Lines: 52 On 03/27/2003 08:36 AM, bert hubert wrote: > > Racoon is just an IKE daemon - Linux is not bound to it. That's true. But until today there had been no discussion on netdev of any userspace tools except KAME, as far as google and I can tell. It seems high time to begin such a discussion. > You are free to write your own. 
I think before I did that I would throw away all the linux-2.5 built-in IPsec features and use FreeS/WAN, which has a reasonably complete feature-set. It's amusing that some people flame FreeS/WAN, alleging "it's _not_ integrated, and this is a major problem" ... and alleging that the linux-2.5 stuff solves this problem. Somehow I don't understand how telling people to write their own key-exchange daemon is the winning "integrated" solution. > The OpenBSD one (isakmpd) also works under linux. Folks who wish to pursue this option are encouraged to look at http://www.uwsg.iu.edu/hypermail/linux/kernel/0301.3/0582.html which announces a port of isakmpd to linux-2.5, available from http://bender.thinknerd.de/~thomas/isakmpd-linux-2.5/ BSD IPsec in general and isakmpd in particular have a better design and vastly better documentation than KAME. However, the existence of isakmpd does not answer all questions about the completeness of the IPsec feature-set. For example, BSD provides an "enc0" device and documents using it to implement network security rules. Alas I see no sign that linux-2.5 provides this feature. If I am overlooking something, please explain. I ask again: Is there a document somewhere listing the set of desirable features and the status thereof? Or otherwise is there something to reassure would-be users that a complete feature-set will be provided? http://www.monmouth.com/~jsd/vpn/ipsec+routing/feature-list.htm From ahu@outpost.ds9a.nl Thu Mar 27 13:59:21 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 13:59:24 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RLweq9025323 for ; Thu, 27 Mar 2003 13:59:21 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 9542E45F8; Thu, 27 Mar 2003 22:58:39 +0100 (CET) Date: Thu, 27 Mar 2003 22:58:39 +0100 From: bert hubert To: "John S. 
Denker" Cc: netdev Subject: Re: ?completeness of IPsec feature-set Message-ID: <20030327215839.GA31029@outpost.ds9a.nl> Mail-Followup-To: bert hubert , "John S. Denker" , netdev References: <3E82DCF7.7090706@monmouth.com> <20030327133659.GA11820@outpost.ds9a.nl> <3E8371B5.7030200@monmouth.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E8371B5.7030200@monmouth.com> User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2083 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 2048 Lines: 50 On Thu, Mar 27, 2003 at 04:48:37PM -0500, John S. Denker wrote: > I think before I did that I would throw away all the linux-2.5 built-in > IPsec features and use FreeS/WAN, which has a reasonably complete > feature-set. :-) > It's amusing that some people flame FreeS/WAN, alleging "it's _not_ > integrated, and this is a major problem" ... and alleging that the > linux-2.5 stuff solves this problem. Somehow I don't understand how > telling people to write their own key-exchange daemon is the winning > "integrated" solution. I sense some... anger. Linux provides the RFC PF_KEY protocol and also uses the RFC ioctls to support IPSEC. Any compliant IKE will work against it. That is how development works in the kernel. > For example, BSD provides an "enc0" device and documents using it to > implement network security rules. Alas I see no sign that linux-2.5 > provides this feature. If I am overlooking something, please explain. 'enc0' is an internal abstraction, do you need it? > I ask again: Is there a document somewhere listing the set of desirable > features and the status thereof? Or otherwise is there something to > reassure would-be users that a complete feature-set will be provided? 
Right now, the kernel side of things is nearly complete. I sorely miss IPSEC NAT traversal which appears to be pretty patented. > http://www.monmouth.com/~jsd/vpn/ipsec+routing/feature-list.htm This is mostly about userspace. The current attitude is that the kernel provides the hooks and we then hope people start coding against that interface. A large amount of the things you suggest can be implemented today. Some time ago I took a small shot at porting the freeswan ike to the standardised IPSEC ioctls and PF_KEY protocol but it differed too wildly. It may well be useful to continue this effort. Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From jsd@monmouth.com Thu Mar 27 14:58:53 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 14:59:00 -0800 (PST) Received: from av8n.net (pcp03191463pcs.midltn01.nj.comcast.net [68.37.175.11]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RMwCq9026769 for ; Thu, 27 Mar 2003 14:58:52 -0800 Received: (qmail 1109 invoked from network); 27 Mar 2003 22:58:06 -0000 Received: from localhost (HELO monmouth.com) (127.0.0.1) by localhost with SMTP; 27 Mar 2003 22:58:06 -0000 Message-ID: <3E8381FE.5050603@monmouth.com> Date: Thu, 27 Mar 2003 17:58:06 -0500 From: "John S. 
Denker" User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030323 X-Accept-Language: en-us, en MIME-Version: 1.0 To: bert hubert CC: netdev Subject: Re: ?completeness of IPsec feature-set References: <3E82DCF7.7090706@monmouth.com> <20030327133659.GA11820@outpost.ds9a.nl> <3E8371B5.7030200@monmouth.com> <20030327215839.GA31029@outpost.ds9a.nl> In-Reply-To: <20030327215839.GA31029@outpost.ds9a.nl> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2084 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jsd@monmouth.com Precedence: bulk X-list: netdev Content-Length: 2582 Lines: 74 On 03/27/2003 04:58 PM, bert hubert wrote: > > Linux provides the RFC PF_KEY protocol and also uses > the RFC ioctls to support IPSEC. Any compliant IKE will work against it. > That is how development works in the kernel. Saying it's not a kernel problem isn't the same as saying it's not a problem. I was told this was the proper list to raise such questions. If it isn't, please point me to a more-appropriate list. Certain parties are touting linux-2.5 IPsec as a complete and "integrated" solution. Either the claims need to be toned down or (preferably) the level of integration needs to go up. In any case a "status report" clarifying what remains to be done would be helpful. >>For example, BSD provides an "enc0" device and documents using it to >>implement network security rules. Alas I see no sign that linux-2.5 >>provides this feature. If I am overlooking something, please explain. > > 'enc0' is an internal abstraction, do you need it? We agree it is an abstraction... but I wouldn't have called it "internal". It is a documented interface, so the better half of it is external. 
As to need, I get 1290 hits from http://www.google.com/search?q=enc0 so there is prima facie evidence that people use it. Uses include -- mentioning it in packet-filtering rules. -- using it to communicate with userspace about things like MTU and default source address. -- mentioning it in routing rules. If you have evidence that everything that can be done with enc0 can be conveniently done without enc0, please share it. > Right now, the kernel side of things is nearly complete. I sorely miss IPSEC > NAT traversal which appears to be pretty patented. Do you mean these patents http://www.ietf.org/ietf/IPR/SSH-NAT or others? Also, I have heard reports that NAT-traversal was "coming soon" to linux-2.5. Again, a coherent status report would be helpful. >>http://www.monmouth.com/~jsd/vpn/ipsec+routing/feature-list.htm > > > This is mostly about userspace. The current attitude is that the kernel > provides the hooks and we then hope people start coding against that > interface. A large amount of the things you suggest can be implemented > today. A large amount, yes. But perhaps not all; the enc0 question is a case in point. Again: saying it's not a kernel problem isn't the same as saying it's not a problem. Commonly real-world usability and scalability depend on "making the whole offer". Users really aren't interested in something that provides 60% or even 99% of a working solution if the remainder is not readily available. 
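For reference on the PF_KEY interface discussed in this thread: a minimal sketch, assuming a Linux system that ships the <linux/pfkeyv2.h> header, of how a userspace key daemon might build the RFC 2367 request asking the kernel to dump its ESP security associations. The helper names build_dump_request() and open_pfkey() are made up for illustration and are not taken from any existing IKE daemon; opening the key socket normally requires root privileges.

```c
/* Sketch only: build an RFC 2367 SADB_DUMP request for ESP SAs.
 * build_dump_request() and open_pfkey() are hypothetical helper names. */
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/pfkeyv2.h>

/* Fill in the fixed PF_KEY base header; returns its size in bytes. */
size_t build_dump_request(struct sadb_msg *msg, unsigned int seq)
{
	memset(msg, 0, sizeof(*msg));
	msg->sadb_msg_version = PF_KEY_V2;
	msg->sadb_msg_type    = SADB_DUMP;
	msg->sadb_msg_satype  = SADB_SATYPE_ESP;
	/* RFC 2367 expresses the message length in 64-bit words. */
	msg->sadb_msg_len     = sizeof(*msg) / 8;
	msg->sadb_msg_seq     = seq;
	msg->sadb_msg_pid     = (unsigned int)getpid();
	return sizeof(*msg);
}

/* Open the key management socket; returns -1 on failure
 * (for example when the caller lacks the needed privileges). */
int open_pfkey(void)
{
	return socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
}
```

A caller would write() the filled header to the descriptor returned by open_pfkey() and then read() the kernel's replies; any IKE daemon that speaks this message format is, in principle, talking to the same kernel interface.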
From jmorris@intercode.com.au Thu Mar 27 15:22:40 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 15:22:46 -0800 (PST) Received: from blackbird.intercode.com.au (IDENT:root@blackbird.intercode.com.au [203.32.101.10]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2RNLvq9027255 for ; Thu, 27 Mar 2003 15:22:38 -0800 Received: from localhost (jmorris@localhost) by blackbird.intercode.com.au (8.11.6/8.9.3) with ESMTP id h2RNLcu00392; Fri, 28 Mar 2003 10:21:38 +1100 Date: Fri, 28 Mar 2003 10:21:38 +1100 (EST) From: James Morris To: bert hubert cc: "John S. Denker" , netdev Subject: Re: ?completeness of IPsec feature-set In-Reply-To: <20030327215839.GA31029@outpost.ds9a.nl> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2085 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@intercode.com.au Precedence: bulk X-list: netdev Content-Length: 228 Lines: 15 On Thu, 27 Mar 2003, bert hubert wrote: > Right now, the kernel side of things is nearly complete. I sorely miss IPSEC > NAT traversal Derek Atkins is working on NAT-T. - James -- James Morris From pekkas@netcore.fi Thu Mar 27 22:33:03 2003 Received: with ECARTIS (v1.0.0; list netdev); Thu, 27 Mar 2003 22:33:07 -0800 (PST) Received: from netcore.fi (netcore.fi [193.94.160.1]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2S6X0q9001227 for ; Thu, 27 Mar 2003 22:33:02 -0800 Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h2S6Wnd25537; Fri, 28 Mar 2003 08:32:49 +0200 Date: Fri, 28 Mar 2003 08:32:48 +0200 (EET) From: Pekka Savola To: bert hubert cc: "John S. 
Denker" , netdev Subject: Re: ?completeness of IPsec feature-set In-Reply-To: <20030327215839.GA31029@outpost.ds9a.nl> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2086 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pekkas@netcore.fi Precedence: bulk X-list: netdev Content-Length: 999 Lines: 23 On Thu, 27 Mar 2003, bert hubert wrote: > > I ask again: Is there a document somewhere listing the set of desirable > > features and the status thereof? Or otherwise is there something to > > reassure would-be users that a complete feature-set will be provided? > > Right now, the kernel side of things is nearly complete. I sorely miss IPSEC > NAT traversal which appears to be pretty patented. Note that at least some IPR-claimants have decreed IPsec NAT-traversal a royalty-free technology in the interest of promoting its use, see http://www.ietf.org/ipr It seems SSH and ipunplugged offer RF terms (to the extent of implementing the IETF standards track implementation, which it isn't at the moment) while Microsoft and Cisco don't. I'm particularly interested in whether this is considered to be a problem. -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. 
Martin: A Clash of Kings From Robert.Olsson@data.slu.se Fri Mar 28 00:28:08 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 00:28:16 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2S8S5q9006100 for ; Fri, 28 Mar 2003 00:28:07 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id JAA11541; Fri, 28 Mar 2003 09:27:39 +0100 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16004.1914.582979.278009@robur.slu.se> Date: Fri, 28 Mar 2003 09:27:38 +0100 To: Jason Lunz Cc: netdev@oss.sgi.com Subject: Re: [Fwd: [E1000] NAPI re-insertion w/ changes] In-Reply-To: References: <16003.11449.497905.815776@robur.slu.se> X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2087 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 457 Lines: 19 Jason Lunz writes: > I've seen pretty much the same thing. I plotted throughput vs. offered > load for e1000 4.4.12-k1, 4.4.19-k3, and 5.0.43-k1 (all backported to > 2.4.20). A summary with graphs is at: > > http://gtf.org/lunz/linux/net/perf/ Illustrative. > 5.0.43 seems to be a significant regression, both in terms of throughput > and CPU load. Can you test the patch I sent for 5.0.43 with your equipment/setup? Cheers. 
--ro From ahu@outpost.ds9a.nl Fri Mar 28 02:51:32 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 02:51:37 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2SApLq9012478 for ; Fri, 28 Mar 2003 02:51:22 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 7C56F3FA3; Fri, 28 Mar 2003 11:19:57 +0100 (CET) Date: Fri, 28 Mar 2003 11:19:57 +0100 From: bert hubert To: Pekka Savola Cc: "John S. Denker" , netdev Subject: Re: ?completeness of IPsec feature-set Message-ID: <20030328101957.GA11075@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Pekka Savola , "John S. Denker" , netdev References: <20030327215839.GA31029@outpost.ds9a.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2088 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev Content-Length: 975 Lines: 26 On Fri, Mar 28, 2003 at 08:32:48AM +0200, Pekka Savola wrote: > It seems SSH and ipunplugged offer RF terms (to the extent of implementing > the IETF standards track implementation, which it isn't at the moment) > while Microsoft and Cisco don't. > > I'm particularly interested if this is considered to be a problem. The suggestion from Linus is to continue coding and leave this to people who can actually read and understand legalese. We are not qualified to determine what is allowed and what is not. There is some precedent, IBM holds a blanket patent on 'compression' but has promised not to enforce it, I think. There is libz in the kernel. But leave it to the people who care and code onwards is what Linus says and I would tend to agree. 
Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO http://netherlabs.nl Consulting From toml@us.ibm.com Fri Mar 28 07:22:19 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 07:22:26 -0800 (PST) Received: from e3.ny.us.ibm.com (e3.ny.us.ibm.com [32.97.182.103]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2SFLWq9020210 for ; Fri, 28 Mar 2003 07:22:19 -0800 Received: from northrelay01.pok.ibm.com (northrelay01.pok.ibm.com [9.56.224.149]) by e3.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2SFKlkD149456; Fri, 28 Mar 2003 10:20:47 -0500 Received: from tomlt2.austin.ibm.com (tomlt2.austin.ibm.com [9.41.94.20]) by northrelay01.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2SFKf1Z035278; Fri, 28 Mar 2003 10:20:42 -0500 Subject: [PATCH] IPSec: Missing IPv6 policy checks From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 28 Mar 2003 09:22:15 -0600 Message-Id: <1048864940.14454.10.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2089 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 4449 Lines: 145 Below is a patch for your consideration for some policy checks that are missing (as compared to the IPv4 code). This patch fixes some of the tunnel mode problems I've been encountering. I'm not completely sure about the change to ip6_output.c as far as the placement of the xfrm6_route_forward call within the ip6_forward function. Please review and let me know if I should make any changes. 
Thanks, Tom diff -ur linux-2.5.66-orig/include/net/protocol.h linux-2.5.66/include/net/protocol.h --- linux-2.5.66-orig/include/net/protocol.h 2003-03-24 16:00:20.000000000 -0600 +++ linux-2.5.66/include/net/protocol.h 2003-03-27 16:19:33.000000000 -0600 @@ -50,6 +50,7 @@ struct inet6_skb_parm *opt, int type, int code, int offset, __u32 info); + int no_policy; }; #endif diff -ur linux-2.5.66-orig/net/ipv6/ah6.c linux-2.5.66/net/ipv6/ah6.c --- linux-2.5.66-orig/net/ipv6/ah6.c 2003-03-24 16:00:56.000000000 -0600 +++ linux-2.5.66/net/ipv6/ah6.c 2003-03-27 16:20:40.000000000 -0600 @@ -330,6 +330,7 @@ static struct inet6_protocol ah6_protocol = { .handler = xfrm6_rcv, .err_handler = ah6_err, + .no_policy = 1, }; int __init ah6_init(void) diff -ur linux-2.5.66-orig/net/ipv6/esp6.c linux-2.5.66/net/ipv6/esp6.c --- linux-2.5.66-orig/net/ipv6/esp6.c 2003-03-24 16:00:52.000000000 -0600 +++ linux-2.5.66/net/ipv6/esp6.c 2003-03-27 16:21:05.000000000 -0600 @@ -499,6 +499,7 @@ static struct inet6_protocol esp6_protocol = { .handler = xfrm6_rcv, .err_handler = esp6_err, + .no_policy = 1, }; int __init esp6_init(void) diff -ur linux-2.5.66-orig/net/ipv6/ip6_input.c linux-2.5.66/net/ipv6/ip6_input.c --- linux-2.5.66-orig/net/ipv6/ip6_input.c 2003-03-24 16:01:13.000000000 -0600 +++ linux-2.5.66/net/ipv6/ip6_input.c 2003-03-27 16:22:28.000000000 -0600 @@ -43,6 +43,7 @@ #include #include #include +#include @@ -149,7 +150,14 @@ hash = nexthdr & (MAX_INET_PROTOS - 1); if ((ipprot = inet6_protos[hash]) != NULL) { - int ret = ipprot->handler(&skb); + int ret; + + if (!ipprot->no_policy && + !xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb)) { + kfree_skb(skb); + return 0; + } + ret = ipprot->handler(&skb); if (ret < 0) { nexthdr = -ret; goto resubmit; @@ -157,9 +165,11 @@ IP6_INC_STATS_BH(Ip6InDelivers); } else { if (!raw_sk) { - IP6_INC_STATS_BH(Ip6InUnknownProtos); - icmpv6_param_prob(skb, ICMPV6_UNK_NEXTHDR, - offsetof(struct ipv6hdr, nexthdr)); + if (xfrm6_policy_check(NULL, 
XFRM_POLICY_IN, skb)) { + IP6_INC_STATS_BH(Ip6InUnknownProtos); + icmpv6_param_prob(skb, ICMPV6_UNK_NEXTHDR, + offsetof(struct ipv6hdr, nexthdr)); + } } else { IP6_INC_STATS_BH(Ip6InDelivers); kfree_skb(skb); diff -ur linux-2.5.66-orig/net/ipv6/ip6_output.c linux-2.5.66/net/ipv6/ip6_output.c --- linux-2.5.66-orig/net/ipv6/ip6_output.c 2003-03-24 15:59:56.000000000 -0600 +++ linux-2.5.66/net/ipv6/ip6_output.c 2003-03-27 16:22:45.000000000 -0600 @@ -50,6 +50,7 @@ #include #include #include +#include static __inline__ void ipv6_select_ident(struct sk_buff *skb, struct frag_hdr *fhdr) { @@ -747,6 +748,9 @@ if (ipv6_devconf.forwarding == 0) goto error; + if (!xfrm6_policy_check(NULL, XFRM_POLICY_FWD, skb)) + goto drop; + skb->ip_summed = CHECKSUM_NONE; /* @@ -781,6 +785,9 @@ return -ETIMEDOUT; } + if (!xfrm6_route_forward(skb)) + goto drop; + /* IPv6 specs say nothing about it, but it is clear that we cannot send redirects to source routed frames. */ diff -ur linux-2.5.66-orig/net/ipv6/tcp_ipv6.c linux-2.5.66/net/ipv6/tcp_ipv6.c --- linux-2.5.66-orig/net/ipv6/tcp_ipv6.c 2003-03-24 16:00:45.000000000 -0600 +++ linux-2.5.66/net/ipv6/tcp_ipv6.c 2003-03-27 16:23:08.000000000 -0600 @@ -2193,6 +2193,7 @@ static struct inet6_protocol tcpv6_protocol = { .handler = tcp_v6_rcv, .err_handler = tcp_v6_err, + .no_policy = 1, }; extern struct proto_ops inet6_stream_ops; diff -ur linux-2.5.66-orig/net/ipv6/udp.c linux-2.5.66/net/ipv6/udp.c --- linux-2.5.66-orig/net/ipv6/udp.c 2003-03-27 16:18:57.000000000 -0600 +++ linux-2.5.66/net/ipv6/udp.c 2003-03-27 16:23:12.000000000 -0600 @@ -955,6 +955,7 @@ static struct inet6_protocol udpv6_protocol = { .handler = udpv6_rcv, .err_handler = udpv6_err, + .no_policy = 1, }; #define LINE_LEN 190 From toml@us.ibm.com Fri Mar 28 08:54:02 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 08:54:14 -0800 (PST) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.104]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2SGrLq9021772 
for ; Fri, 28 Mar 2003 08:54:02 -0800 Received: from northrelay04.pok.ibm.com (northrelay04.pok.ibm.com [9.56.224.206]) by e4.ny.us.ibm.com (8.12.8/8.12.2) with ESMTP id h2SGqfhF166376; Fri, 28 Mar 2003 11:52:41 -0500 Received: from tomlt2.austin.ibm.com (tomlt2.austin.ibm.com [9.41.94.20]) by northrelay04.pok.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2SGqcOR005558; Fri, 28 Mar 2003 11:52:39 -0500 Subject: [PATCH] IPSec: IPv6 AH/ESP fixes From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 28 Mar 2003 10:54:16 -0600 Message-Id: <1048870457.16800.5.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2091 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 5521 Lines: 161 Below is a patch for your consideration for some AH/ESP problems that I encountered during tunnel mode testing. Please review and let me know if any changes are required. 
Thanks, Tom diff -ur linux-2.5.66-orig/net/ipv6/ah6.c linux-2.5.66/net/ipv6/ah6.c --- linux-2.5.66-orig/net/ipv6/ah6.c 2003-03-27 16:20:40.000000000 -0600 +++ linux-2.5.66/net/ipv6/ah6.c 2003-03-27 14:42:29.000000000 -0600 @@ -199,7 +199,7 @@ } } - nexthdr = ah->nexthdr; + nexthdr = ((struct ipv6hdr*)tmp_hdr)->nexthdr = ah->nexthdr; skb->nh.raw = skb_pull(skb, (ah->hdrlen+2)<<2); memcpy(skb->nh.raw, tmp_hdr, hdr_len); skb->nh.ipv6h->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); @@ -287,7 +287,7 @@ x->props.header_len = XFRM_ALIGN8(ahp->icv_trunc_len + AH_HLEN_NOICV); if (x->props.mode) - x->props.header_len += 20; + x->props.header_len += 40; x->data = ahp; return 0; diff -ur linux-2.5.66-orig/net/ipv6/esp6.c linux-2.5.66/net/ipv6/esp6.c --- linux-2.5.66-orig/net/ipv6/esp6.c 2003-03-27 16:21:05.000000000 -0600 +++ linux-2.5.66/net/ipv6/esp6.c 2003-03-27 14:42:29.000000000 -0600 @@ -108,7 +108,7 @@ struct dst_entry *dst = skb->dst; struct xfrm_state *x = dst->xfrm; struct ipv6hdr *iph = NULL, *top_iph; - struct ip_esp_hdr *esph; + struct ipv6_esp_hdr *esph; struct crypto_tfm *tfm; struct esp_data *esp; struct sk_buff *trailer; @@ -154,7 +154,7 @@ esp = x->data; alen = esp->auth.icv_trunc_len; tfm = esp->conf.tfm; - blksize = crypto_tfm_alg_blocksize(tfm); + blksize = (crypto_tfm_alg_blocksize(tfm) + 3) & ~3; clen = (clen + 2 + blksize-1)&~(blksize-1); if (esp->conf.padlen) clen = (clen + esp->conf.padlen-1)&~(esp->conf.padlen-1); @@ -176,7 +176,7 @@ if (x->props.mode) { iph = skb->nh.ipv6h; top_iph = (struct ipv6hdr*)skb_push(skb, x->props.header_len); - esph = (struct ip_esp_hdr*)(top_iph+1); + esph = (struct ipv6_esp_hdr*)(top_iph+1); *(u8*)(trailer->tail - 1) = IPPROTO_IPV6; top_iph->version = 6; top_iph->priority = iph->priority; @@ -184,13 +184,13 @@ top_iph->flow_lbl[1] = iph->flow_lbl[1]; top_iph->flow_lbl[2] = iph->flow_lbl[2]; top_iph->nexthdr = IPPROTO_ESP; - top_iph->payload_len = htons(skb->len + alen); + top_iph->payload_len = htons(skb->len + 
alen - sizeof(struct ipv6hdr)); top_iph->hop_limit = iph->hop_limit; - memcpy(&top_iph->saddr, (struct in6_addr *)&x->props.saddr, sizeof(struct ipv6hdr)); - memcpy(&top_iph->daddr, (struct in6_addr *)&x->id.daddr, sizeof(struct ipv6hdr)); + memcpy(&top_iph->saddr, (struct in6_addr *)&x->props.saddr, sizeof(struct in6_addr)); + memcpy(&top_iph->daddr, (struct in6_addr *)&x->id.daddr, sizeof(struct in6_addr)); } else { /* XXX exthdr */ - esph = (struct ip_esp_hdr*)skb_push(skb, x->props.header_len); + esph = (struct ipv6_esp_hdr*)skb_push(skb, x->props.header_len); skb->h.raw = (unsigned char*)esph; top_iph = (struct ipv6hdr*)skb_push(skb, hdr_len); memcpy(top_iph, iph, hdr_len); @@ -257,7 +257,7 @@ int esp6_input(struct xfrm_state *x, struct sk_buff *skb) { struct ipv6hdr *iph; - struct ip_esp_hdr *esph; + struct ipv6_esp_hdr *esph; struct esp_data *esp = x->data; struct sk_buff *trailer; int blksize = crypto_tfm_alg_blocksize(esp->conf.tfm); @@ -269,7 +269,7 @@ u8 ret_nexthdr = 0; unsigned char *tmp_hdr = NULL; - if (!pskb_may_pull(skb, sizeof(struct ip_esp_hdr))) + if (!pskb_may_pull(skb, sizeof(struct ipv6_esp_hdr))) goto out; if (elen <= 0 || (elen & (blksize-1))) @@ -301,7 +301,7 @@ skb->ip_summed = CHECKSUM_NONE; - esph = (struct ip_esp_hdr*)skb->data; + esph = (struct ipv6_esp_hdr*)skb->data; iph = skb->nh.ipv6h; /* Get ivec. This can be wrong, check against another impls. */ @@ -336,7 +336,7 @@ } /* ... check padding bits here. Silly. 
:-) */ - ret_nexthdr = nexthdr[1]; + ret_nexthdr = ((struct ipv6hdr*)tmp_hdr)->nexthdr = nexthdr[1]; pskb_trim(skb, skb->len - alen - padlen - 2); skb->h.raw = skb_pull(skb, 8 + esp->conf.ivlen); skb->nh.raw += 8 + esp->conf.ivlen; @@ -370,7 +370,7 @@ int type, int code, int offset, __u32 info) { struct ipv6hdr *iph = (struct ipv6hdr*)skb->data; - struct ip_esp_hdr *esph = (struct ip_esp_hdr*)(skb->data+offset); + struct ipv6_esp_hdr *esph = (struct ipv6_esp_hdr*)(skb->data+offset); struct xfrm_state *x; if (type != ICMPV6_DEST_UNREACH || @@ -416,7 +416,7 @@ if (x->aalg->alg_key_len == 0 || x->aalg->alg_key_len > 512) goto error; } - if (x->ealg == NULL || x->ealg->alg_key_len == 0) + if (x->ealg == NULL) goto error; esp = kmalloc(sizeof(*esp), GFP_KERNEL); diff -ur linux-2.5.66-orig/net/ipv6/xfrm6_input.c linux-2.5.66/net/ipv6/xfrm6_input.c --- linux-2.5.66-orig/net/ipv6/xfrm6_input.c 2003-03-27 16:08:24.000000000 -0600 +++ linux-2.5.66/net/ipv6/xfrm6_input.c 2003-03-27 13:37:54.000000000 -0600 @@ -186,6 +186,8 @@ xfrm_vec[xfrm_nr++] = x; + iph = skb->nh.ipv6h; + if (x->props.mode) { /* XXX */ if (iph->nexthdr != IPPROTO_IPV6) goto drop; @@ -199,9 +201,11 @@ goto drop; } while (!err); - memcpy(skb->nh.raw, tmp_hdr, hdr_len); - skb->nh.raw[nh_offset] = nexthdr; - skb->nh.ipv6h->payload_len = htons(hdr_len + skb->len - sizeof(struct ipv6hdr)); + if (!decaps) { + memcpy(skb->nh.raw, tmp_hdr, hdr_len); + skb->nh.raw[nh_offset] = nexthdr; + skb->nh.ipv6h->payload_len = htons(hdr_len + skb->len - sizeof(struct ipv6hdr)); + } /* Allocate new secpath or COW existing one. 
*/ if (!skb->sp || atomic_read(&skb->sp->refcnt) != 1) { From linux-netdev@gmane.org Fri Mar 28 09:32:23 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 09:32:29 -0800 (PST) Received: from main.gmane.org (main.gmane.org [80.91.224.249]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2SHWKq9022874 for ; Fri, 28 Mar 2003 09:32:22 -0800 Received: from list by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 18yxi8-0004iF-00 for ; Fri, 28 Mar 2003 18:32:12 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 18yxi4-0004hP-00 for ; Fri, 28 Mar 2003 18:32:08 +0100 From: Jason Lunz Subject: Re: [Fwd: [E1000] NAPI re-insertion w/ changes] Date: Fri, 28 Mar 2003 17:32:08 +0000 (UTC) Organization: PBR Streetgang Message-ID: References: <16003.11449.497905.815776@robur.slu.se> <16004.1914.582979.278009@robur.slu.se> X-Complaints-To: usenet@main.gmane.org User-Agent: slrn/0.9.7.4 (Linux) X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2092 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lunz@falooley.org Precedence: bulk X-list: netdev Content-Length: 373 Lines: 10 Robert.Olsson@data.slu.se said: > Can you test the patch I sent for 5.0.43 with your equipment/setup? We have a winner! I called the driver with your patch 5.0.43-k1-ro1. I also removed all the cyclesoak lines; all they really show is that throughput is really sensitive to how much CPU time ksoftirqd gets. http://gtf.org/lunz/linux/net/perf/ is still the url. 
Jason From Robert.Olsson@data.slu.se Fri Mar 28 10:44:44 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 10:44:48 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2SIi1q9024209 for ; Fri, 28 Mar 2003 10:44:43 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id TAA21260; Fri, 28 Mar 2003 19:43:23 +0100 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16004.38858.803274.216457@robur.slu.se> Date: Fri, 28 Mar 2003 19:43:22 +0100 To: Jason Lunz Cc: netdev@oss.sgi.com Subject: Re: [Fwd: [E1000] NAPI re-insertion w/ changes] In-Reply-To: References: <16003.11449.497905.815776@robur.slu.se> <16004.1914.582979.278009@robur.slu.se> X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2093 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 710 Lines: 24 Jason Lunz writes: > Robert.Olsson@data.slu.se said: > > Can you test the patch I sent for 5.0.43 with your equipment/setup? > > We have a winner! I called the driver with your patch 5.0.43-k1-ro1. I > also removed all the cyclesoak lines; all they really show is that > throughput is really sensitive to how much CPU time ksoftirqd gets. > > http://gtf.org/lunz/linux/net/perf/ is still the url. Excellent! We have a flat performance curve at ~350 kpps at any load and packet size again. Of course this can be improved further. I think Intel still did the hard work but left the "fine" tuning and testing for us. :-) I guess we propose the patch to maintainers. 
Cheers --ro From jgarzik@pobox.com Fri Mar 28 22:39:17 2003 Received: with ECARTIS (v1.0.0; list netdev); Fri, 28 Mar 2003 22:39:24 -0800 (PST) Received: from www.linux.org.uk (IDENT:exim@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2T6cVq9001807 for ; Fri, 28 Mar 2003 22:39:16 -0800 Received: from rdu57-8-131.nc.rr.com ([66.57.8.131] helo=pobox.com) by www.linux.org.uk with esmtp (Exim 4.14) id 18z5SQ-0001Zw-VS; Sat, 29 Mar 2003 01:48:31 +0000 Message-ID: <3E84FB87.7060405@pobox.com> Date: Fri, 28 Mar 2003 20:48:55 -0500 From: Jeff Garzik Organization: none User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20021213 Debian/1.2.1-2.bunk X-Accept-Language: en MIME-Version: 1.0 To: Marcelo Tosatti CC: netdev@oss.sgi.com Subject: [BK/GNU] net driver fixes and such Content-Type: multipart/mixed; boundary="------------070700040502090906060303" X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2095 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Content-Length: 96650 Lines: 1504 This is a multi-part message in MIME format. 
--------------070700040502090906060303 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit --------------070700040502090906060303 Content-Type: text/plain; name="net-drivers-2.4.txt" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="net-drivers-2.4.txt" Linus, please do a bk pull bk://kernel.bkbits.net/jgarzik/net-drivers-2.4 This will update the following files: drivers/net/e100/e100_vendor.h | 319 ----- Documentation/Configure.help | 1 Documentation/networking/bonding.txt | 142 ++ Documentation/networking/e100.txt | 9 Documentation/networking/e1000.txt | 131 +- MAINTAINERS | 13 drivers/net/bonding.c | 574 ++++++---- drivers/net/e100/e100.h | 17 drivers/net/e100/e100_config.c | 23 drivers/net/e100/e100_config.h | 4 drivers/net/e100/e100_eeprom.c | 4 drivers/net/e100/e100_main.c | 304 +++-- drivers/net/e100/e100_phy.c | 38 drivers/net/e100/e100_phy.h | 4 drivers/net/e100/e100_test.c | 2 drivers/net/e100/e100_ucode.h | 2 drivers/net/e1000/e1000.h | 32 drivers/net/e1000/e1000_ethtool.c | 104 - drivers/net/e1000/e1000_hw.c | 1920 +++++++++++++++++++++++++---------- drivers/net/e1000/e1000_hw.h | 303 ++++- drivers/net/e1000/e1000_main.c | 580 ++++++++-- drivers/net/e1000/e1000_osdep.h | 24 drivers/net/e1000/e1000_param.c | 89 + drivers/net/pcnet32.c | 42 drivers/net/tg3.c | 8 drivers/net/via-rhine.c | 135 +- include/linux/if_bonding.h | 9 27 files changed, 3192 insertions(+), 1641 deletions(-) through these ChangeSets: (03/03/28 1.1058) [netdrvr pcnet32] fix multicast on big endian (03/03/28 1.1057) [netdrvr pcnet32] revert to 2.4.19 version Many negative reports on 2.4.20 version indicate the cable cxn state change patch was bad. (03/03/22 1.1006.9.110) [via-rhine] note that Roger is maintainer, in MAINTAINERS (03/03/22 1.1006.9.109) [via-rhine] changelog (03/03/22 1.1006.9.108) [via-rhine] reset logic Since Linus and Jeff raised the issue of PCI posted writes, I cleaned up wait_for_reset() some more. 
Experiments show that with MMIO, a reset may indeed take seemingly longer -- that is fixed by flushing that buffer. Also, the driver now polls the appropriate register while waiting for the reset to finish. (03/03/22 1.1006.9.107) [via-rhine] fix races This patch addresses two distinct races: - Until now, the driver started the chip for Tx regardless of errors pending in the status register. Not good if an error occurred while we were queueing packets -- the chip counter had not been reset, so Tx died. (We can't reliably get an interrupt for every error condition) - The Rhine-II (when under load) frequently produces a Tx descriptor write-back race error. Failing to handle this means waiting for the netdev watchdog. Fixed. In addition, we must wait for the Tx engine to turn off on error conditions before we scavenge the descriptor entries. Failing to do so will typically lead to performance going down to about 10%: Burst, timeout, burst, timeout.. (again, with a Rhine-II under load). (03/03/22 1.1006.9.106) [E1000] Increase default Rx descriptors to 256 * Increase default Rx descriptors from 80 to 256 to give better Rx buffering capability in the case of heavy Rx load with small packets. (03/03/22 1.1006.9.105) [E100] Honor WOL settings in EEPROM * Honor WOL settings in EEPROM: only advertise WOL magic packets if set in EEPROM. (03/03/22 1.1006.9.104) [E100] forced speed/duplex link recover * Bug fix when changing to non-autoneg, device may lose link with some switches, so try to recover link by forcing PHY. (03/03/22 1.1006.9.103) [E100] Remove strong branded marketing strings * Get rid of all of the strong marketing brand strings and replace with simple pci_device_ids table. pci.ids should be the master list of device/ID strings. (03/03/21 1.1006.9.102) [bonding] fixes, cleanups, and minor feature addition Here is a patch against 2.4.21-pre5 for everything added to bonding since pre5 was released. 
This includes various bugfixes, code cleanups, support for netif_carrier_xxx(), and some minor features. (03/03/21 1.1006.9.101) [netdrvr tg3] fix memleak in DMA test Also, bump version to 1.5. Leak fix contributed by Don Fry @ IBM (03/03/20 1.1006.14.16) [E1000] whitespace fix from previous patches * Corrected indentation from previous patches (03/03/20 1.1006.14.15) [E1000] Controller wake-up thru ASF fix * Fixed controller wake-up through ASF (03/03/20 1.1006.14.14) [E1000] Added Interrupt Throttle Rate tuning support * Added Interrupt Throttle Rate tuning support (03/03/20 1.1006.14.13) [E1000] Added Tx FIFO flush routine * Added method to flush Tx FIFO after link disconnect; the hardware hangs on to Tx skb's that were in flight prior to link loss (03/03/20 1.1006.14.12) [E1000] Whitespace changes * Miscellaneous whitespace changes (03/03/20 1.1006.14.11) [E1000] Compaq to HP branding change * Changed "Compaq" branding to "HP" (03/03/20 1.1006.14.10) [E1000] Read/Write register macro optimizations * Optimized E1000_*_REG macros (03/03/20 1.1006.14.9) [E1000] Tx Descriptor cleanup * Completely clean Tx descriptor to avoid potential dirty descriptor fetching (rare, but possible) (03/03/20 1.1006.14.8) [E1000] Perform single PCI read per interrupt * ISR cleanup; performing single PCI read (03/03/20 1.1006.14.7) [E1000] Modulus math removed * Removed modulus math; decreases CPU utilization, especially on PPC64 [anton@samba.org] (03/03/20 1.1006.14.6) [E1000] Added MII support * Added MII support (03/03/20 1.1006.14.5) [E1000] Added 82541 & 82547 support * Added support for 82541 and 82547 gigabit ethernet adapters (03/03/20 1.1006.14.4) [E1000] IRQ registration fix * Fixed IRQ registration bug; IRQ now registered after resources are acquired (03/03/20 1.1006.14.3) [E1000] Spd/dplx abstraction; eeprom size changes * Setting speed/duplex is now its own routine * Update ETHTOOL_GEEPROM routine to use new eeprom size variable (03/03/20 1.1006.14.2) [E1000] Version, 
copyright, changelog and MAINTAINERS * Version, copyright, changelog and MAINTAINERS updates (03/03/20 1.1006.14.1) [E1000] Documentation/networking/e1000.txt updates * Documentation/networking/e1000.txt updates (03/03/20 1.1006.9.99) [E100] ASF wakeup enabled, but only if set in EEPROM On Thu, 20 Mar 2003, Scott Feldman wrote: * Check if ASF is enabled in EEPROM, and if so, enable PME wake up when suspending. (03/03/20 1.1006.9.98) [E100] ethtool EEPROM and GSTRINGS fix On Thu, 20 Mar 2003, Scott Feldman wrote: * Bug fix: read wrong byte in EEPROM when offset is odd number (03/03/20 1.1006.9.97) [E100] Validate updates to MAC address On Thu, 20 Mar 2003, Scott Feldman wrote: * Validate updates to MAC address as valid ethernet address. (03/03/20 1.1006.9.96) [E100] interrupt handler free fix On Thu, 20 Mar 2003, Scott Feldman wrote: * Bug fix on e100_close when repeating hot remove/hot add from team. Basically need to disable interrupts and unregister handler before shutting h/w down. * Need to mask only the relevant bits in the interrupt status register. 
(03/03/20 1.1006.9.95) [E100] OS already calcs pseudo-hdr [anton@samba.org] On Thu, 20 Mar 2003, Scott Feldman wrote: * OS already calcs pseudo-header (and we got it wrong) [anton@samba.org] (03/03/20 1.1006.9.94) [E100] Cleanup #include order On Thu, 20 Mar 2003, Scott Feldman wrote: * clean up of #includes (03/03/20 1.1006.9.93) [E100] Add support for VLAN hw offload On Thu, 20 Mar 2003, Scott Feldman wrote: * Add support for VLAN hw offload (03/03/20 1.1006.9.92) [E100] spelling corrections from 2.5 On Thu, 20 Mar 2003, Scott Feldman wrote: * Spelling corrections from 2.5 (03/03/20 1.1006.9.91) [E100] Update version(2.2.21-k1), copyright, changelog On Thu, 20 Mar 2003, Scott Feldman wrote: * Update version, copyright, changelog (03/03/20 1.1006.9.90) [E100] Update Documentation/networking/e100.txt On Thu, 20 Mar 2003, Scott Feldman wrote: * Update Documentation/networking/e100.txt --------------070700040502090906060303 Content-Type: application/octet-stream; name="netdrvr.tar.bz2" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="netdrvr.tar.bz2" QlpoOTFBWSZTWU99x2sB/DT//////3///////////v/////oSQAAhCAAEAwMEAiWImgIYVdc t9OADwoDQAAb2aiAAAAo3AfbjoMy3scd7u7GRTWw2grbWrMffa9ew23S9zvmB6HTEz5zOb7N 81ZUr3fEcNReNo9GUXoMm895ezzjPqVXYCr3gengB5H3BYdfLu3a3t9w+qSK1QYhRNaHpmjq gZ7Wm21On3YH3zB9H0oByDrz59wDr0eXQHRoChVfZLAp9ANPJib7uK9BoAFOxgFAAAOgUYfQ NBFSoSaaovfR7x9dB69wgR7tm0nsDLbWx0fIAHoBoABQAfO9AYPTA+9Zp8Aafe973bPQABAA G89W1n0to+Hyik++Le7z61499c9Jdu3oMeHp97LZXdLbdtIA6zdNCcPcL3Xz3tnwLx8+m9ff VWty7tPlvdubfdqj7bW1hUgpTbNaKUlvrBtjoPvrOg8mPvueWgeAZAA+hoOgAHc+ifWL3d6X TRwXHMx2uvd3bX3HennvdJ9yx9Htjve94fXNZvcfXL1VrGhWd11bm1fOdVNzTrIZtSJW02p7 7O8sz12ndLMabvfb3p82lPe+96+gCNtrem7Xe96Hd5k+++4z699vbLK2312+97zy56973ez3 d496vt3Xcfb3fV9b13S3tu93vXvOp66h973N8xmZNfe2HtsPu5y6763eG+Nbzt9u+u7IZ7ku 93ycHxtvne556etfbPPPbkFDoX2dbjjoa+2+zwe969zDebvgPU95aXAVQAAoFPe081T17uNr 19PeAAmvHbX20++09nr2enweZr117wHG752EHddul1Q6dNDTNZdrg7ugddGba7Zxdqt3d3AY 
brVdnV50eHc93bud9YLz3O9h73u1LdfXjy3b758+9Ytu7a069we99Pjtepxamavr3LSgenU5 hgauNRT1p4hqp7sOL211oaAb13R3J0Ze7vb3ub6++tWt7u2yXTW3RUqqE9490jxm2wO7OPYN L2PSbvvb4AegANApN33ve69rXb6699fc29gtmo+7vNz50AvsoADT6aX09bVr1vXu8qt3NbPe 9D2esHvZ1ltffe97vu9285rNu7r6Ojlerpbjuq1vjcejg65nvUHdt733sEppAgBAAQAJkAEw gAJoxBo0mk9BJPZFNkTanqMgbSAAAAAlNAghCE0CJtSeQNR6RM1T9NGkxTYajI1PTQahkepo aABo00GgAAAABJopEJJkBMETJo0anqb1TTIam1HqeiGmg8k9NTJo00NABppkAHqZAAAAAIUi IgTIAgAJkaE9Jpp6UZqnkzSm8mpptRT09KfpQenqmRofqj1H6iepkaaYQAAA0ESJTCCaCnhG QDVMJmU2mjVPSeQm1MSeGqj9MplGwo9JpptTMKPUyHqbaoA9QaPUD1PKCJEQQEGhBMET1Noi p+1NGjU2VDYo8p6NT1AemoGjR6jQAAAAAAAA/6e6/tVDMv+8KH7uflFD8/BqIimKOVQzKYkK EhLoirSKofuH1dP14q5CUBOjQpMDDwTDCSD2+/3vd8IqMJN8ZH41NSm+KwicF1My9mThHAMk k2gOf9Icg6eEPDKOYghEyZOYoc4KhznDiY4Xzzk51jXMiWIibfBUW6urwTMxNWUsPWHKqleC qHupt1eIlfDkYa+UdQziEMzAgRyPAMCANwjhYQfcWjJUM0MzCBMtCjEqcEBlMULMDEqtKUIm pcgM3YaJRmKCqBhkaoJgaQCJWhAoWhocFUtyJoDCAMKUFLzYiaGiEHUhDJEgFATIFA0UNFFB QEMCUDSFAQVIxQS0lCBEKRAFJn8GgDQCzImogkmigogoDCCkpplh/kLRK0oQSSMQUgQocwdY 4gckVENkAlCANAjzHMqI8JCEDf3UHVa1iqiMav95qYsqpZSlKMYg2oK2xiq0qBbUYDWtKRGx g1LFtZYgxKUtpRG0oNWyrUlYUW2LFoyog0/QZlVUVUVaZIZIKEm/5LTfz/5/xePB0Mh2hRlp WVKWRlBRFFFLUhWyilCiULUKUtRYrIiNttFqKqIjFESttsbSxtWrGo2VltFBLKxVb6/mUwih x5jNXXLourTTTZC2FtG7VqKVKtYu0/on85aPlp2p1aKwJHsqNmMNqKfENTPNa4TZixtLSwd1 Pks8f6f+7OddBCg/mX0KUJcui3ISQIQgS/fGUTMKPMs2baygpSrEVM2KVtZYWgFkwmBtAUxd QqG2uwlFwVwiDNlKiWUmywc5ExUutupaFmRZrlhtUFKZFLaXYXCSiNRK5CipbmXGiamjjVZt KxRc1kqsclbTYq62JR2y63Y2btigrXGWKsGKYCwrFU1rWFNrGtm2xkRyURE1GMc3KulKYaiK NLRNW3GumusuLGoqtc3F+NqCZBS8rSpiqoa0yqV2WuCzFsxXW6OCrqmwMlTOjpTFIZ1zbRbJ hssssKhYh++l5z5HjPbrnN0P0/2u5zxbe9HFlGmpDI7OAZ/MKhAfbUFE5Z1qJT3GDu0xSS9T qct1UKqrpSquDbOmG62rYWVdnZcW3MzCywtNZ8/zEaOpTro2E9QUc05Zh+rx59mzIevRcPT3 BxEbwMynLagpMJEAdSgcvVzdT0aapRdH7NfNJnUlMdxUh1JoSp0mo65BeuDTPI81ysNrEQ5q 2/Xyzjetg6u4cYrvw0rMk+h+GJYxH+t9lf+P5KqqqoqqqqqKiqqypaqqqqqqqrcoiif7jXlp AqlKMszDCmar/Sc/g+j+nvD2/8b/w/8N/r+Pz1T7YD1+AGH0hHsnUOQZDR7gSJAI9AeghfSR 
erMrMGzDLJKGiIKqR+80UqUSctFWKJeFwpFEVYqKpFWKiqi068j5hT2S036JTjuWmntv97h0 dUVGVAWCgh1rbSaoNGypRgI0VZqNm1RQQS6taNEpVuxmiElS0UZGKmuky2SqqqtMa6A2ltA0 SfgAO+gwnUpZWttQVCywH+pt8CQDx1wOERBqrSm2kygm2yIRK2NGitEFoLFNHFsyiO2utxrR bvvsKeLCwQYwOWdYda5NizVQsKbKlwoO2ltgVgiVi5BYiZVQrWxFFbsCW1bRlKXUtaqkY7at tdVYCW0yItKms2osxZttoNxo6ZtzS4P5PjnXR05tUtu22TVtiojrrsW4dts5M2h4/Q/POTnV FO9QNr+6z2edL1hNrV2MXloZnEojC5xkIi3KuB1A1uQKgKWu0kysiqBiNEFbTWzMyC0BJMyB rjW5jpNdXTFpnIiriyop4nXMLBmD4iHOQ4MOu9DGtZbUdbDDKxRgyluVM5KgsFWINdDBjDjT TAo3POcViwiqYtrbOFUqYyVTQqs1uFjGCILJkEZi4oGX49zcS2xB2XE0YWKzGDEhG1ormVVV VVVVUVVVVNUAqsIquzEqsFCyI2mVlEDFSwZWmQMqQbTQxLVoWkLFhkG0xYOYUg0blHA1mhaC DAUEkMc50Qnv64L2vyPq/5KSmq/5upQgIK5Qh6XP4/8T5B/a+/PX4yX0lmYPxIiq/S1Hj/Oq ViojrMRIjLdRikvPD+n+H2eM7xgYhkJJaYhcS5REbTt4HJVdV7fdLXp6MSoUGaaNVSth6p53 EMqNwJ3hMJEd4rnJEN/fjt5/7/n7s2dnH7w6GTjwfM0xH1qA8LydqtU8IIwrgu7dMK06RCi7 JJJD2PFFKKggUO8v87wVQ6iChYdeJpTWJg94YKiYy8GzMKnaSSCCqma3RVyQ4zp4dJ09TBLp wVWcqBdUtyj3EJRzUXf74erqrj0593yf08Ma0bfPtymrc4Kn+SM0w1C2oIIgocjpalVF2fW9 R44Ufb10qoHEHss7TkR+2ncFnoaj4Z2N7KSr8b+TcWKPutEfZetpjxuebOnWX4mc7nDFfY88 +zmTKyeTy3GtWeeXFHl8Zs3vOVToEkgSQk6QS48veXunEyBervHMVczwlFIlTaTbnfu6XKq9 29d83CUYCiM2bmZQ3tfMN18Z3KIpOWz1rmKVIDQtRQ7Q2t75j/b9/Zh9qjXwyLIsUWd6marU a2Ip4KWUdq01mxYUbaZl1EbaCCHHTkENBv+uPNZ6c+f16/Er7HvX/bAIGoxMEwsR2RNxoKLu CccdOmHHo4iGaH0j2L/UXU+P31abzhEF9Qz8VUTH7frV5/Rus2o+nn8tV4vlhJiQQNPXITB7 5/Y5vz/E7Uf1DT6OOIkjyJN6mklIkIVHDiQHSXQ4OhCP++iUNyg2UdsXSitXfjPme3KoyX1L NaCoqKB6attT9G11qxCywaCipqacskoCYaKH/il71iiENnZXdKP+H74cm9ef47pZnl0fbfzL +nVXn/lGVP8OqFGuXzRg24g7x/Y8DmpDx90IUmknTorNEYeC/7ilN5f+CBuR7I/faAd9bAJJ V+N/FSR1xJJ3dppIdCiL84MQLS/slrRhWiHZ1nB/gsgE1UervDCDXzg6BJQf0tPGR0hneNzK /HlQiAiX0miXTriM4sxP95BCRSHbX4pDG3VOXL5QpdUiKlqT/kRHUOeb3zSutKc48+7u9nZ2 nkto1CpRIVUqhshKKKqlKjGLUOjjPh+L72Dd1Yr6deyBp/XPmicKPSkJ0JfL8UuaOU0MkVoO gmm1/Rr1KGghBrQyEwiHiEn98tJR7dOQwe7NZkrKHrNEXwddrwZ2d6smpQrk8XcO16LbJ9d9 /DR8mWpJbdDiy/3VOHVjigMhGU8nAlAfLNdm5/dHctf0oL/rue5XwObPtYq/2yhaniI+3/f8 
kFkoHzadmXHCnkzc2MZV8MFh1QbOwEP3OwHPmk9VazIiXxdl1Jo2uX/V5P2ToOzvo8aHj3V9 2a9tk9vrPRYPGv5d/7O8OKh8S8qBx0e1OYcK08ou4rH0r9AFpXl0yH28AaTNvdCd0J0R3wfy V+SFfDGCQdzjpMuCPm5dfgtTEmp4/ix1HSYrbBI5Yr/f+/o9/b6Kf8KKlUY2n6K2Pu1uRHTC Fsg68XxwaiimdEvsylI1eWm+T+VMTZ3giAiKg9lVuiFKb59+d0u7s/fzf8kEkCdOgSyj9FcK sI1xjOH1EVA6fcPwRCtffPTpTHMpRPYOQRBF72svQyURDiZISGR47/LLxNLqlX5ut1tKoUcH TAmTPqWlQ45MME7tSsPm6pDjVEnzIYA7vu28XkvykLyoSKiCbC59iCKvZM6R7l9ewYdBE/f4 Pi/s1mgK3HJLrbOqj7PB5K3SdIdOmde2Iwl2j95m+iV4Zhj0+/n0YOI7owJu79LOIP2/NjCe EBp24o3vSzgcFIqwrRKt8W0s4jb925+93ZqT5iZBUFEFIFqGQ849+hSi0IIwPOpM/k3zv4X9 Yk+t6VQ5AYOHT3xlBJCjHug0U3f4uH0eL7pSnRJlSkI1q3umpBhNrW61Wo4zqg43vBLJCTMm TDJB/YMwXjHbDoOZP0qC9e0+cXRTGE11pw/Rc99L+9Kq0t/dsp3X5v0fDP8eW6ElRpYQaqR0 fRc851WgxnQ0ENVCg+lQMnqxykWInVSxda689pCrUtN7iBg3DMGTM2wGBfoDArwYJgwZtm7n 6IqMefn6KdHSTeFKx5111zqrIRlW9I28yzcnRkw/0LejfBnBdg7uQUvU/uS+9BE/W7ZDK5TB Fp2HiGdloRLu5uNxi5KkVS76/CAfWj7ent8Xr7Lfwq0fbqCDCQyBZ2xzIDRp8/Rw/XfA7x3u G72TDYlSqJfiWmPVt44ggQJ0kLhPDidrQ+6b0oxcA/f8e/6E+zG1ct1sBsop+1GZIgaujP2c 9YEepyD7jZFBbZWRa+XXh5XVVDuDge9BAjIhHs0/+LcUxlLMRB6s9w/f/qcw9ePY5T9W8cIW 063vdUzwSKIhdTK+pyZmGhco5h8+/ifpWbpPTp3TvK7yPD4N8WvC8cl3tu1ZDIUeJvzqfb9n pPzoxDV/14ZswnT0d9c6m71TN0r+XbXC3FLMnfTojpg8YyPG8truUKXxpfDCjqZYjYjdyvWg n+buUdnZ6Ig6SdAk3CO6Ld7ZCMKo6UCCZd18YeVl1SxM4a3rNy9virSmFeKqS3ZO5a/BDvb2 przfKYqXZsl8bktMSqTq0+nd0O6lIohBDImEgzz6WYWSEdKoddLrljbqHDNQgukcm1zUlDNQ 3Flndzu7DB3HBmZHy/H6N9b7ksfxpzKlO+pg/qo3M+j31NZxTPPHmkivTWMy3ddSdzVd3XNZ j3WiOIxkb6rCkj4Z3polaixQO2He9+PWsgsK0iV/Y+W99GPEatVVQU9K+nEXiXXCeBdn4+O/ v4Ofq5aXBJNmSo5/0XSXfk50Y0L6YsiDYQHMo117768KiunpvVjEooi1D+cys428ya/n72BY CdUstk61fR3b0xjIDb7jgmnovX09G6qpxOjz10MdTrI1qbraU1vCx12emr6eZGlUt7+s5meF SvXvexRXCGSwQvifj6aVoWqGuK7fu98831u+6NEaV3dR3f3RIFnOeiMbR0d6nAuG1L5ObMZM 3lNR4xkavse/sUqJ8WDCZG36+UXnLC2qV42/geKp4cf0XxhhsUBKYCH/aPDr043hjKSygSPm OX4n9xOj2pfIVHVbIUBS5NSyssSw6HM7YcPPR2mxwyXHSHT2CopkkkJ1TvUzKZkguWc/pHm7 YZNvUaU6+/z7zPE0KaNZWpiXdm5GJEJrR77rTNKVWr794Skq0yQCQJAkDJUN/Cy0yFn4GUO/ 
s5iR7Nk9VYGCvvuDg2TGLGg0Si+sE2lEIDMKkITLZEEObcVIkSnjlM8aD1HwbyHD3oYixAuO cM5iSKcMp+1v7O74/Jy+LmeCXvKicaoIHItJtKyzwuS8df5/WaheclEIxg7t5djWIRpHbqjH WAL6/sv1xa9FhTmt4lNUISfojzYSTJJlY6ZlWmfX/cQ6XhC9NNf9Kbv+oq8+Umt57IQMRw+N NCfw6IM7FalDm+RbZfeSfzL1wjVEfhGEYQd7oyI6ovOQkSFLHKpp8HmEN6ZwEENGuENC5JEK 731Y0bWr/C9q5zT1bTEtayoQ+vkXpgV64Z/BQ5PlA6T8vuklP69nLZFE/h2wj1J5/VWQrQ34 o/U7hcgSTHu/G355abnYwTGW25hkwQIQi4xiaGpM5T+SIEJYQ/e1M9D84fqiHmY7N5LwOY5g 1DAxE0d9jsNAQhDrkTMexcyyrzkMK2qlHnSjB3gtbz/8fbcxVPo/Lx2eTM533nhZ65PVLNGP lqgOtVe0Ucgp4EZoEhKtCd0jS7E3XnsrRje6lAatDs3k9U8LasYpYp2T+dSKwstHzvG5zgvD WsuNuN2azSfyfi/X518vs4xfjtUGcseVkmsQkWjDfCILtccTCEzJB3Zcz1xKrXHt9Puhf+l8 ygoqofrTbTHLR/WoqpKE1XJNG+bx+3Pb0FymDDi+4+mHXHFTKCIhyKNyPwz2i3Q+EzoW15Lc K17fb1GTjPOeuvCt6xfZYaHQSGBPGXmREXThTqoIUILePL+pU/wcOwbwydNanLU6tRba6o8V NArjLoZ74KTkDZVNvCiDlbR+FWkzKjv1+mWiYDjFyZhjEgkg0qLuQEhNm10T5my1xqJ97u9t 6ut6SvZOmsM+cUvLOnnT3pkwlydaLwL9GvrmfHL6fLwrkTiRTjeRDkvWz7f7IHXGBRVS5zCJ hIJzCcx17LjDCpH5ft8De7pnZJCRyh6RChMC0f23ISZIPbxEefxP7F1d6gmB1D/NE4vceEFG EkyYXj8wbydxMyRkQ6PpWFYQ8+QKyLOk92PDEguZUKS17LO2fzmbxRR77iHE4HdBnKcvW6Y8 f1GUXaYKEETeaYHEgQ0kCQNx0Hc+jiXlj5h4+eHCRH7jyw7fnqFCtvLuel45ckBioKkNGUJ/ XnNxHqJUn9nqnNIJBjAPxiA4CoKPaFEsJjXcu4TGCVcXW3Je4YoOAYpQ2jZB3+eKriwmIQ/O FhEiJlJM2nZ6K+UpWZiLXlBzVRp81kZaGEDHnU9Lga1as3TCFX/e7oGvLn8hfXq04eV4oW03 lnyisfifhM6857G1fMfkk+SHSyOz9Zg48OQXMTGlXh9727PFdd6SF/rz04p/hyob56uJ0TB4 T7kNvgwd/P978vaH1H6PIMYQAhIsAjEYQSacDM5efy5vOcIyY72Qxr4eTEulMdchFLR9llK1 UjxvXXgy4btJl5cnSX226S8FFDdQu+RK9Lcg4myX1SD6vqGB1nmPyKL5so+Ndc5ECh2a8ok5 P8+UanUnhZMT64s/Ow8k3ekfUtrO3HxaPEjr7nA/d24X4bispIis1MqoXrOflXC75+jiUQ1w 6zjb+cj/QKs1iIjxX5z5YGdUTY41Xa7ao1Kq9P+27xxuV9WeFitRn0PnjDv2vrBzJhQXEeS3 BZZ/EsVKVz8H84pS7zke3ezg5dGuXTVOc4SjqrlGJoWgz/n32yKl67Ywimo74Jvi454a1JVP mh/srl3bpERV73hXqt80bvbQolFx3eOiqq9ZZ6jPuX1d35TZt3f++sr5Kl2n/Ed8qYh/xKO+ It7iFUizSh/NmxQ0FLUmcQmhuVcSnF2lVQ/di9FNPywrRHj9DLvy7a+zu+Ymv7aHxiEILJFI SslEUKVCI1H0wUBBP0Aj+C6FBP24VR3HbKmAUQNJQBKUsKpSpEEE0gkyAyElKAESgwQpkfJF 
OvRh/Vzgeq0cmB8cJuF7Md1oOIqPiCABtgA+yA+389Jjx0n7XH2TfkAd3kAOh0FOEHZAANYO 2I6TxSjSioeT8tCGkTLlovogD9HvUi+MFYnd5Oj9Fc/s43Q8vCAuVzk3PzoEt6qQaE2DdI5m 6E55os/BBcn4uJ8vNCVnDyxt/lSi0lkUs/9d/rw+TCo5/RtmQX9fR49Etdln2d3KBo+77aeE m/VvLK1fFwmH13m/18sL4W82LyG5ssYRTF9mJCCbB4wIiAgLn24wXi+dQ7ENPv788WZkYbID B6JTpDMcQTpSMfrkQ/JfgkDz8c55w8VtlMUWoSHFx0w4fbCzd0YDBusJfU8Qsvnk+OktqACl 7evYHTYcwXvYjuTy46aKAioqSqN5IKidyoQICAef7MTZAFIB90C+HmwHOcAJIOIRBRPul8GQ zMQYDDn+716Ph836e1UpSdMpEJed4vSqZCYpSrrlLOIjFvnMZU6zyx3P6PyZjuASwwDTfHHJ xz2mXeC7sUqi5wwe7zdvU7f04B2ZR4DzDzbhuOu13d2WdHc7tptGjWTJmVngwcM6ZMJCQkJE RVEQVVVSERREVFF/o+T8+fe5+R/OB/Pw+ymY/T91192Bg/sXipyh1KoZq+DZ8+dVyZmmuf4N eHmw4aCQCKiKJAIiqkD4swoomKqAoZqWJqqpiJpqaohgiogghqiiirKqsF7G7925VAZIjII/ z7SCK9XICRJiaIDCUcIpgmFkIvD6ESdmtiQUxVUVNVVNVVRMRRVE1EUVUf7MBiu36vlDYHSH QHQHO3OHSVsAcvHftS38/Pxj/WDpHTVm5OjZ48WGMCy+1o4XjsEa98PRzdvHjJuhncVIwYAS AE15GBzJkcbwKBMkOJANQKfXKDfImgatiNWdaYHHuJbYsssSw0nNCC5VyrEZdfLblfX5hHyo K1ur2ttQw6LjWeGgxP+nD+2HR1bj4LUb+3t7vv548815T9DRvOjVCmzeuiyZOEC1olN4xYqy sZ+RyB6nD/LCaJKGCSmkCloYur8/8nwfJ+X4MenrPpq/FNZc2eP07ORP8QZ8SyQiRCMWJ0RH CEiSmJGCEiipSCSCRCIJYYlaRKiaaYiilpKKYkoKamQoAqmlqqSgkoiIkhSAIIihiEIogCJQ gg/LC5FLQvq+yQ7vn+95nTdGmesJuzM3q+PAgfQmYAdDMEwto2+vf8DzvthnDlO+zRWpjYAw UZGrciAzIQNZpq8PF4iU6CFoCo/PmJKxARESnrsJKaRKSKiIViUKbD5PX7Pv/R9F+m7unUDs 7ZKBPmlR3ac1/JHZfW0dOCjUk8OljYSQCCYECSZhSHs+84GBJGb114m7vTvNvKA+ZM1BC+9V 0Vh4P4nIw4fORH895hkDb/GpUDSyIIemrbJ9WKOstWkoftdoM/Z8b9jFVmFF0Mhuji4FdbIH Yc9VtgaTadCDYP5PRL/qR3a+hFgYIsQ7atTQ01XP9XRPrfT+UdyRAsPtf6YhMgAMR6H2/apT z3Co9D8d6zESySWnjVMAI1bBesP78nCqhMq/Jh8n+Hli4aINoTMfD4nkdDD8Y/CfRF+j4Sqp KHyU4Dz1L5yLzQSJu8h/thC8PYjoTdIhuVLp9T+NnUXTJCmOiMDvlye3ZpH34ng8xRDWh64f CJ7x0IRfZ90le/5lDM7/TNbKPzYEOQAtwSX+xbWMJaXd2BLYURvx4/OTddaF/c38N4+PJtj+ Drz+w9APpbJC+ABpJ5QLo870zKCH9BCeknH7i0rerX9y+NzG2ZJNvRj1/d3tP8JrP4XmxZJA k5VqfCb/j9Lh52/gsm9rs7FrO3xyTtjIPrEQTb0URpVt1zKyejjqtIfdZGf8I90tQxNs0tTN 4xN847iZQ/nzt+b+Tb5HLtVutB1tr7mfqvgYoA1hkirIxFcHG7BTRJJ2+N7E08id1TmZw1ww 
E3VceK0dI4Qdt637t12utDrpuwmF9Ubx0dOGLys1E/f70EXt6ZHi/43tpSz39u7/cqq9Pj27 /Dyp+Rhw2b1pE0CP5ckS5GLjOPXzO4uvydgUoXTS+bM9xOuAOosMnu3wZ4OEsWKxYicY0RhH 1lFabc/6+cnDdddsq9CGb3dTjeqOmDNWhskRR/rTW++RvOUzYs2iE8eyUh6SmJhGHZtDSDsv 2G9NEAipAILN9ALwv0+fyl6lvxe3UPJz7BPzPHD+DlHj/gZvUeu1rMX/xz6RpMeVEwRoEw6a CasIxIDjs8UaKyNvISpn87nv0mj4ony41cT0ktjkD2yMclnHTs7f05s/q/n+U4sYNlYGIS1I 9YjLDw06vXr5P6e771MsQsTd+86S7xZGWudONtqyuKKHsgb0iLhq3TOX50vrd5TGzoKf4vWm CwqOzsBR0yMxBm6mO4eMnGPh6gVNGB33UGapDbr4WxRR5rbCWcjyuFAgARGM5LYxT7SM2pF2 YqHm0KudOKPDj88SQ1mhw3iaLU71Av7dJaN51yRLCVAoIznMr46H1luCW7D48sdNZeMz1fLJ /eq8Zb/DjbXcz5CgD44Aa7B2a17HIx5Z3EKSIa7qTPV+H9vU4ZSQr8X56Olhw94jEHx5kFLN RE0fyv5nNI0L7fI2+jjoSUp46sygeIjLYsbaSyUoWgWiwa2jX+O+pzmqT93sej197vx4r97X obHlotb+9BuppxOTl4bmvEebbc+0/lPXOyy+/evzp/J652L7e1eMOWmS2vNwU88y9J1rOnOV HWl+i8OFWo22vVOo8Sq7InNgVYhpZU5dxBeYp+j91Mp0i9GzUq25yxQSIblAU1k/eP1GNTtE 1dGSGLkpt+ka8LRIUR5vqecopVrxDHR9lOc6cLbVLXaLMC/h+s/cT6D8Zz3v2f2mTtPhFMVP c2ZMxaAgcTOCoUqDndm6av1w8/HokYZ5wDz+iMl+9S+xTN0/uzFkuWu/zc7V8993jJNc+zk3 yhjpxui0pfVGNnJDPnJ4ZeffdaqgrV6B9eEtqf31y7s9R+dfohjaOTnUiP8Yfn2SkudAnx5b PLHRpr2b1rzGJ6pOrleVhae+D2XwHwnDZuLY34TiqQLHlM5aTjxd9lUQzllP1yh1c0PojpOP Xu78L8cs/qvDm1P9k245PqrLtDvLt9XN6eSMdOEIvRlujcXkTrzNGGaDvR8FTlVIRqjCqKXV ri7CquB818NMeEfNIhERROmu8UIH8mVp5erjrppuNGh7Fysj1I1TxCpuuvM9IvB364xI3fpz 9PQ1kObBcejT0RfkxwukY4hok/RBwgsVKPMqs3fZa1SsspDR6pirFxvnTTZz18b9jynXXht9 EbPLrPJfubpX8Xi34h3x1QkUcwoXC88cenmtDLGQJb9qw+2X85DAHwU/B2HoUQvR+YZtduDf CIEWGbg+j4mvGmYhIlZ+Of1MxQYtoHUpKbs5kfj+HMSC7nW6J8jBdH9uY8qgGEsGYrBkBZ+w cD8UXMgTZqDWGdm2nzXxaOk1MeWFSbwbnKv6fBmDvX89R937kfl+p5FNtlE2oUoaWesTMuGX eFDjrMwtabxuJatY90vm96/t97y8V3jo5uFWWlnSdeqXl82qmPVu8O6p41Br364G6oeOubpv hvd5ut4d31pXHbU6pebZfA1zatOJ55LPG3jzePUr0XZUeJKkqF4c75OHp2omtBWpjz0PXFeH POOdJVzzBnE4VURBGHlJQ+FFLGliMDuXlPpTDOJIUnmoO8Hg8Hfb1NzZcmK0bknGi5p7nqaJ Q28rGxjqKchFewtrjznmh55HQQvOLAF+44m+A5yZ0QOE1AmVUz6R5dLWS8FAe/cotJoVyYAy wKvpmau6zLlJ1XKRZDVwhvtJDY5IEEhdvzbjs9EqyrSymYsWEY/ewcwKHJ9R7dvxp9lNv++A 
iXgKUVejhb5vH4/L0eHjjGMY+Pn56qqUpSlKyqquuuuuuvU1wNczBihkJhQ/wsOMFAUgfCIJ 9Hz9D/OB/sHtqqxQzKqsU98/sPwv0fr/83D/gPf+cSO+pzMD4COBvf8JvtoQJ6XaEoII0nOK m7OPxg4w7uIb+Si0MLzRhMPLGJzxxjLQTU4X8tPaPoXP8es4fWYRrvqVriVhCl/hNMrknGth GYlfpK/Bu2XxyVHyn393yLAZgt6YHtLQ/KIeawvCo8nd+jFKrffRvL+nrofxZn+/fZrtvLWt jumQIYRKScsRMwOyhEwsAioKeqjfLEcfP89HHG3i67Of2ih8HrO/maUKAJggZpLnAJsqRQIE IGLCSfZaVsgR7TzO33oHlLzfwcGiiTBn6zQsf277DpMcrx9mXEvmwL9/uh99nW4V6c6f09nC IHEDSHbcSisf09/9HHQ/LdAB12lCEuVOZVMol5wZvCKnHp+blgX1++oaq7K0gHrJDtGobu3f cc45XiisQdjFIG8byWBo/It5Qpn63k+ZQQrWCLxd3SeCKxId3B73HUkzD6RxFBgG6UHwQNO7 Gy86bpkzux8XxyxZphjdRpSYaRrEb7A7lWUtWtyvEuLDHEravKeemuzoNedGuRgvzmZGPl4N nso8Lvvo1PpDD6PhJ3+gKz2TLskmtBZXXX0q9lJVj6cY/2rK7zdDMBnaXJbn0zC6QQkkmCK9 TszHsLOS7PP2fAzlsvhbXDNzYKH4CCiXKrfMODBubz3jFyDinLpwbTewGqHAzj8Ph8/xfRCD qNRL5HUXdOvpjKHyzqhKFIQSTVTlOHS8xczDu+VqYl7qIiG0oWs5d9XFCATE5aFWtZw+M4xm NYl3FUlWpI1MzSzKytVGVRqpzF4lcE9MMAC5uQY3McDTRmAYRqmSQ3dwNxFpstcWOQtZgGKu cpQ7X3E+tzk5rLGLuT32RY/V5KymP3ZU1GiyE2pBp5C6ZY5VEkgyrK4gdg5rntrY+O/wxfuX ODJwwH7NV2M0zQztaePIUR24Pz/IsNmHAfK3iV2kqMZ029s7RzlcrRqX3hbIyziohHlz5sx9 22g4TvvqV+TaxncdKw42chmjA/m8D60PE4pQFQD9nvu2h8OevZYfn6vzqP0NSbzazXgepuVa kl2g4bFeh/JgA6QK3oAuLIjCAMIoe7x+eg0IuTS0PeKKgy8eown+RKhFjb8f48/csKQqcbWf ldlouLpjZvcqy3zXFpLAk4xVd779GQVb6gZrBGp2stoiT53klFbtsiSJDg6Zk3EbBVp0lqO4 87WBadefRB1hAoBDVlEhogRawFpzy9F+tgrNPhN0hzkGcviarFYV5ZE7qNcFcTLGrM0hE+tL KFSXo1mVdZ7/EChx+yyEztPLFJ9+equl6RjGmys5PJLyxCvxE8XBWvhrMay77giD1EM2Fm9/ yM/CYg9PwQen3XMuN0jHbzcnxxJ7fdZa43gTaIemd+PEaaXTEmcdwT9My4CFWRZRtlJmhJJJ Y6SyktOsJSZG9NWMCAyVmEAyQ9pNtBsnGei57saxATsZiptUH+NzcnbQ7EsSTaq2srxcWRXq qJkwwMJ4keViGxujkJWyYU05vyhBkGct1X3s9VdJMbLhjQwWjeg1u3HsQ78lDGZEMqHD09H7 TryOqs7nsopWXNiUaT2afdcUW+smKrNZYqZ+EFQ2yI1xYrSqEbITNM9ZP90cUQowrGxNrbSx t9t1Y+1YlpbWz2tdqM0bQcFudlDUAhi4B2Y2y6mGsxjE3byuRoqjY/M3ibp2Nr1yeQkZ02O0 ZKNto9zcjM2UkkISU/rL7+NxyGmHxn2/GOlFj2oQ+OoZgO6GYv204C619mE/iB+4vQ8ZN+F3 XbnFtxO/v015id+RGCcVJ4FLww1MwZX1otd8Mz3mowt/uDsfZ+G+GawMqMxFByfI3f8z9iUA 
c+VLllCL9MO6L0ReoL5k0PmHlzCqB4UXdVnM4a0XD6UA5gg0J3mZd8qsTVotFw7rQnmrUpYp reMkOYTouJuSJjJrL2+dRVJ7VZy8j4wReEYhxYHuHVPlGlStXlh6d9JknlVBa0pRCZrRlNd6 3h79Ul1L4pnqvrRFfSvXdind9Z9U6rzOM5WZtIuoiLclNL5mQtMMO7wjWcRh9Zzq5UXL1IQs PFXRKFkmGjCvE6l7d9fOM38i/X5XILJ+Y/j7tSb3osO2JxLCqzl3T8/U7C9D/wvA7fNX68W8 0zGt0z/GdUEhQy8AtUPS2hCagvuEUqHZQZdb5xe3hncF1tyfP8Cif33r4/dqX4Wvuq4JSe2p axYU9/DUZz1JKDAH4Ex0Jk1twQLmM5LeIvDRrtjxgH2oru3mxiwkMlmd2ZmSQkmSEJDIEJj5 0DshnwqLQl8WbZFnXs82en9v6rqXSBD9AIemH6umdc9NnJqe5WeWhKHwPQnAao8jETUtV7sU pnvMZFaAjRhpLIkRKy8hix5eyDbEddQQq+Of4+2RPi/BTzflAwD0ZDuoPJ/mJ+F9dQchc1vj nmFsOZdG0izGVnxsVdr3ptuuZMy99LvE/V42cJNw+6uDRJCYWrM17AE+gZghQh0Nh82Hu6zn MHfxQOvhJiAmF9vRcc308/iIlDu8HJoGOHiBp6pMdhujuOdCR6AwqC2Ljb7Gj2hGWo52M2Hy eh3pTEsZjkR5Y2oO0xH7yZ1w6zpc55HGvp3RCzz97fXt2cu6vlL0thamS1HXnI6bD0BNyiNo kLOA2Kb7U1rHnZBP00/Kc9mSBweEeoONSUiWJQ0MGyPx3tkOD36p4NXrrKHhnDc381KT5pTi VVHnI313jhgH4N4CkU10ezH/PsMJczkcmcO1mFWTATGkd4D2uCdKxAVKP34tMU0UmYYUVQUR BERFC+c2xDvbLFBYqgoKKIxYoiUFBSRJEhVAxFVSUrE8LxePx7RHQd5AbkMj6KP5jMdgH0l1 Od+LspwXSWwY9UQHSDUiXVFi28eRyDu2Ds0eBZX4J1r8o/I0p9GhWKqr9CpRUQt7w1VWpWsn c+8WJGqq3dE+rPynyfGfs0RT45iuoscHtE15x6cNdAaYthsnAiJXJMksREU1UMBBMxUy54me AUqKbnJzigQmjgGpNiYfE2wR4xvhGxtir7u304mIeIMvob9XQNqZjdR9P1WG8kIdwyM/pA2x aCW5KagusdTkJoQoQ7XvJ8NPzw5rr4l0ofPQEMjWMI0yP7gYowfbqrsJyVff0fr1ZpToNGJm 0fslzHrY3/ReWwy1e2vzJsB40Efpw3aohD4e841ULdcBNCcvOwRIyGuWuwNCPBttHHs0qRJB PXUo0kAkuLstDQNJAZC/mIcYooIC5RJhJjSPLprgQTafieiTC1/vz7MJFbRu2dhNtDYpxISG ghDsk7M4W8nN5pzJju6SY5rnw22Ih5BFILEPp/Va4A7AJSbIrM7NQ6naCYSJfRaTFsLlsy4f 0dxshe8PgiMGlyzh/dAcTJxBCCraotOi52JGJxXFjLUSLH5CdsrJFhstNd7B4bOtjsfWASet u9zPW35yYizdJj3rREqnjyAxGJ4rZl8tmpAwQRpLZehTkGBcEe4Q1nsE8lMx9AL0Af1BCgCg J984FREQRRpSiUZbEVEREWCVPzHvnWPh8X0b6mph4KdoYNmiVUhNQaIWshCQkHoZ62MjT3xA 5ggVELpYaScrtewgQLtigQM+ey8NWdkyaiXg7OhISD362Zp6zLMNQOiDmkXRsPfr29p1Lv2U rR0IMBj9rMAxw4RCZu3t0Eahmk5oOwy5NmbXES26q/T+vW7of2Nur360vNS7C6ea836YZ14I SXOHybAoj5K5Q8VV+YNKchpREH0SCGl0YXx1BDULOh5w58/N4Nw01JHBajpygxNGR8MM5ukZ 
siich0DCDDALjAizYLbkQMI/qtu9MpZNLMcxqyu2dBYz8nVy3MwMykzlDZBt+HMGRZndCTJM qlQLxNRsB24Gxz4fg7fYehKNs4QDmRJFsrd21z/dVT67o5Icx7uvs+18jP6L7CePUc+PZKa0 VPBCbaXBewtEgS5lHsBg63BiGxypn4lkkQudoLkXo7H9VJwfkaxgIlQmytf6Dn1fE4M0LDsI y+jKUQPAxirSebn67fswgskWMmy1DXy3zf08nnVHjmlEhB3sdnrR7NakaMX0izrKp27Y+pVZ OS+41Mzs31LRA6TUo+d4wLITJkot1D6XU8btxy9Mq90vN3ATPKVH8FPtxvBZY4lCCRzO4SZN wvHYaxuGrRE8amYAWdaChKiPN05TGPDdVz+auDNEQCc/UXG/iB6ncyPzrZg2IvEFqbzQPG2Y 8MWnVyPRo1poNyHi8YG9hEenyMujeLkMreeo2cv0+zzbFwrIKxR8T2N2dgwGZCIbtmspVJ1v 1MwQaFDQwaV6yLBiInmtIFGiVt6Iwas4f8YMbfvp6CqiN/NyRNafV3OrcYtsaJj8YQLtbBYj ODLRSyhejBkZ0EBhY8BGJoEBkB4O7J06ZCOj0rSMeBpGJG+0qK7fycI8fr1RtW8RIK7XRURN 3ZfDrtsVqf05Q5r+joiY1wj6eqfRr6TGHQdeRnzaq+85uO7DdWakOh95ardmcN2wsIGizJJC Fcu2/EDGMltOAek/irUkkwkml4f4WY5GgOyCoxhxHOEtkASBZx9XO7bYtq5iPlsKccxDiKmM SGlgbc6tlLz8DrOWNuvz48zPcm7+MV8tiuiRf4P1fYhQfszIQlEmXL5wWvCv7mY5DJjQ3PVY ng7wWboXW+wE1nUDiGYPoCMxCY0dntkfOjWozlGtYI3hsa/BbUSYyWhGi2s2PuuxE6cPms7Q Y4Y47e46b53dfA4zO7k9/9tVERFQwzMREQMDERFVUkkEEEEzIyMIQIEISRjJny789enbqeN0 K06tDHjDbSAZniUbjFi4mPNiBkmREeDFKdtrkI6xRos/qEWGj7OG6CG7e/WQYUXD0137R8yC 8yMIWaA+uIS1lS0x+fo5oYZNaOJjSNk1VqP3o3iRzFfpPaUtKDsjgoXIgq1CRI1ReDjjtYj7 v4uS2Gm1Fus1zbSR5fcxvh/hH3q4i+d0xcYZkTXaIvJDnt68WiLxD3D11PCzYURc5pJlxFHv MSauDVbffoD0lUrJPrujjE9E4YyfNMkSB4yF0x15E+UXVm8Rgab5nDOfe1RyfyZ/zawa7ydp bdpPEfX7rSXz5ELjmPRCZc+HX1Ecemw54Ds+vAeWTo5DC+RJJB+L8j8Y3mTaldT5cZ+pUtXh 1DXh9U5CRMWVHIovtI56pwPrrgwEMDEhDHLzzifny1+PA8tCokCEImvE9SWuplybHjkyxLg3 jFJVR01KFFpPYf1cujqtO3AWMr7Wlq2ZRm5kwDvx9VkJSsg7JUB0RlYxyQ71jAH+rKxNhFvG Gi6wu+WDd9I6WzF1VM1C925SqTE5l0WDOfFzqiSEI84tSH7RnGMIwgYGIjCywkkgwIIJxkcR jCDAiKmbOwPE8Fx7HKzIPqNHlvI1UFl50JdWs0IqqcIOPAHSUYKf3VDd7ufZUMJsSVMs5xUK iUiHZq1aEF8DskhOEXhviRhVyv8m3fArwHdvwR414Fg5hm5PAiY5pK49N7dfo8tZ8KZU13I9 JTok7PUckIlwEfheZpT+LoP0v3yDbTOsOU1+M5CbT9mvtsswN8a6kD67x05wkdt7+Md2a6+o zKLERMcx6H7qkVCZjUemxpovrsprNjhklGdsYFO70SgGybtKIQKddRGIk+ULY6+Bm05x6A5J ArXYRYaC6OhFdXR5AxqsE1U8dV4m9GF+9+xmXNoMJtcI/TdLB+3NTSm3Vie6GqIViMSTs01h 
AKkjrAhAPGHWzP24UC4geUTJXYvjnX9fHl+clWKm12zYJGUNHKaSDVCnxcoQyOE6uFhU3xMA hOn6Fr5yD8fbum1qNLuJJZH841YGaVKTHq0Sk6cicb030okFgmLbTMVEv4T46yR+r7Xh2OIV xAf6SYEOOHX8R9PpjPrBHlnt7D7Wf3prX29SXVwGRMUR7hEKrS23CtrT3bCqVf7lOJaStRfZ dNE6ukESb8APEd9ZcmKmGZczsJ6bZVyk5xcY9ZCo7R29S+RrNpALkFvahyNfeDB3mqqM5bN8 ByJkG9qHBQ60vjOWMsC56zna77cOaOnSYZmlfWHM+ghcISsm4vptDrEddm3WwBovcuMuIzX8 60GMWDOc3rnWJhWe1at631+ewK5xcyWc9ZBrGjRYwP3potzQl7abtrkJnN6m9aDG2x2TD9rC YvGQitH3KhM5FdHL6+2Qb4jVQ5RMfHPuOwh0nQi8lv3UzP0pvUlxT2g0MIS4Jr2/f9kJ6sIL srhBFlYogwZMVEZNf1PJ+4QJdUbJO3v2keS+UyNwXYFC7FpFUzQciYWtiahASAcDZa1Ki9y0 RcJKvQyNemJLB289D935RhG2N5I8QtsHncd8o463ofAZBh5ImeGzpi+3VByqFusCsWv3T+9u izcHfZNL6TsRjDOxgj9mP2z4kVWFyLCEIRB8PsdNoEiTENhi0yprfi4xh03U9OGP65XhNdBZ lxMJOeAhnRaPCY2JpmyG+EuY/XuBB8jsQlmPGQImaNbp9x+VX0PhIikf2aGG9MHBCZHQ4xBM 5QKFkzCj8f0+LJATHjqqqqcKp+k47u3ccZzFyEjDjMRNfF5bP85PjWswwDGqApMQsuinHZ1U 4L1gB5fCTsNMAeNIbWCvnzpzqh33DVOuXnx3ZTp1rild5Vureutw4fKNR4L3b1wquxefyPa7 PzOjEt9CbWtUY4p7IkpLNOAyuTGC4zg19v0PKtJKqYgs1frL22BGcAAQxtzYaBM+HPdObVFr MjZh7WhyL8pBpgdyyqd7vIV3Nor1kOcgQhUlnRHgwGtkMzFxEUyRtU+qNkMILed/9kJpNRW/ wcY3KCTS7lbaGdCPKxAJ3CMCWuIZqvfZxKz8UaZ1czNgaCPvGpcbjIvshL4nimK8bDTf+L9N L4A+ooeSSz6jF1R5n39vmhhn3VsKJhC+EbdHOqEh2cia9cljgWx/LMTP1s2uAoc/g6I6WK8b 7H6LR4IK1p8JMtezaKjsemfs1e2WeBZy+Jtt1GfEnj0PeX1HNW+NX7NBWQJI0nLbJg1piaGm 3kajedx2Ppi5CzxU+bs5IdRm0vtgwGOzF/W443pxUPu23xy9zteq9/bmXuxrpmfYqn59Mdzi 1EM4hmCCsZ3x9eN5FU+cuKt8hBZ8IZAwetnsRmmVpnnh5cCFD0Ivs3kBehnrtbBbcDTiUOpF ohAvxqO2DPz4fUiEdD1OQgvR/Zn4y/VUbTGus8W6GcUgVYtA5zgwT0WxpGEFDywIXV5qbDYx HfTrrHwYQ1XPZDEGBSwRSwyzHEoZ+ghNqvNM68XB9hMvvGDto+5Nv3vHNUW/c0XUH04kI4R2 EUktcsUQtZ9z/CrZB2kV1F20RP2p01qME0K6Pwb0KBbN2GDfPx6dlXsUi80w48W1y2H3ffpq 5rU2dGe4sNAh2P4vhVrKorenj7nl9TcZuj2ktEcmCW7Ke/fGChz2PBeqGwj5xelN1V11t20J 2EyBcdGBBp6qyDEw+sxrIyo2LLkjrwqWjno12zFzvoWRL7cAYOeEeznoY6sjSQ3pgD4ibOQq jFkWynu/c9yvETsnurTaQCZQNLJDQlUUYkKqqWoVOMb/kpam/HqeOPs7fThv3Kb8PO5AXCw1 l6Srqtga7dbV83dqnerK68LpOfjOmXgejGqokww+dzmLjb/MjJman0QDUyMBPLMLVKSM5rwj 
CfjqjHYX0+DZWR/Yrlk+KHhAsiF3/X7D+hSKsjUkntg+SzwLNrNJEPCaP4yIH5jgKEeYho+w yxZ+B8a0Jv1VkWimj1w7e0dDuyW/DmfJ6FSHqlAg7GA5UGkGCkMzzOqhSF342PoKqeb+/Meo vMquEWgOOQM9uPyV1Zb7AaFWg4EEPA8q0aPC8PJc1Wei4/unn0bhCTag0mhBg8JWYPE0pc5C lmkG/Fo0nGFTNEUQTHcPciHXs8Le9+Hf2Tci+eC4O68oe0J/lP2VPlbcs7y0e7sWpnDnX4+7 JEypplsEsTx7oamb2NU2RlGSDbmC3KVhu+hCbtPDqFMszabXWiVzmXXmg02/WxUfhWOMwVHE 76vqEFcikNjwKpEoITKOd5Gs/TLL6tthEuEIu5r3tKVGBsWe9oI16hm4NsB0+J7QJPyYeUww LNSQpCo1abI1nOjUT+TC2wQq/x72jRV9hUXSWYxf6ajMoEwCFnTkbR6Anh0aDQwbWt0CkC6Q MJ2PVaRMuidZ3/uKsCd9pSQ4RCxkV/LVfSf1pq1W1ZrKQa2/o6qMyYlhdGFOAMDmfxuxZWw7 Zdpj6zMIv5SJ1lXaJr2zUgMcEUl+OhmiSnbwicxmhQaxL7niQG72FvLK9qGHIvqiXn5RuOsW JIrOezBEKk30D+ZaVKuoLzs3PdlWXWSBGwsKqlwkTE1FdXyxLJDUsDmolZyOfHnjkfGH0x/O ya9PG7o/eWk2vo1HYkPDVHAsKdsSlZWW6HNMjFqfIOPqiFQ5g5Fc8WYJwE9fNDQjrIzsIC+4 UsxzzI5nqgeGBBfEvLXrIV5Czl8By7j4PUcrEGcqxOFZbKppEFNedlaNsM1HKvCwsJfT+XTL wkfSILsVg04u/jeFqTvhCcW9njI1kScGmFK5OQIwNtTt1KdHjq4KHXIr6HgfmS3rFVYG28Em 88ZkLnOjNIuSQ7sg6XIhYp+aFdw+I4hcHI+M5/QllDE71PohmtbOzQ+i/CO8VtBnM9uga9R7 I9xIoYsB8zDhynopAk/lZ9ncEu1vG2zwo0g6dfLzhb4BdhdunyRt2dXTAtc+s0k46Lcs94iz MI5YFSwo+6M4V/E07mINuIfRTkt/jlMvutOFWRWKx4OOggmZ1vhKZ2QpouIMcSxmGaL3t0YI aQapRwqxc+LjlZf5DmqfQ0Wr0KSz7h5i8nkPFGTWV0a/M+tGrXmY/EMiBL7dtvK/eMP7745C LvwPJSRsloJ70SQCdGvXDUXQ7UECvfkm9lJwdWqQ7sJlvkPL6J6NuKvjS3+WfM0T1jazOvos LuouHNjh1Ljrd3+QPlNctG1ppkISFKxBV756e/h9NYx08b2W8tyn27PLg2XZvrBInQ+HM0R1 yR/OWKOcfT5XBwYTAh4B47z9p9FTbWlCHlQg6iZoqf20ev7r3lxgzw57tfaOCe0ehR4/P/Ux IYdov4xz2PPGTk45zIlWYBb04IqhRCO4lH4hWhTlqY5NwmghR+JMOMITasetc5WEch47KPm+ dCQ+aneZvIvli6xaiRoiWfqxhLGTQIbXbARZY7X7uywkKhuhYVGYd2S0Ai3JVKUR1fRYlYpj zPmPr2YRkImUxEWwoXPGNXwMIeE3tdyiO+qNAzrykGwbncnmFG2LFghO1SCAoGQd8tZvTU8L BuSAauD7uG0PDh7a9K7dFcN1vaxXLXPoNFCep/HO1iO6z4gYKBJdAmuzndSG2lsNDVYaiRKM VzDvUHkdScs63Jc2aydIh2kUeSsqoW3Z9OYvQnw6yBU3N9ZO7z5HUBpLX5Iaz2e2NHI+/w6z xJ9XLfe5HSINJqqiqNodMap4105IQuRnfSMbc2sns/oDBgDBABguQ9awiV+bfuhTn3OHFFSK aWlAtDPB7zFPmpDvvrhqLuKoO3POgsBwYoiDCXwQPiIditk2xwcMcb4ht3Q2ePYVvW1X5UcI 
2DwvdqaryDYGY44HcfA5SwXE9NkEa5OhJLDxNmiHCT6tDooa884EjoJhcqqWZq6ph45GFbT1 uTzb3GCsVgMFGwyOed0uNsWDosOYmOwGaRaQKA5Cs4YplVwvqyZVGqTF2rQ9c9C6IUnimxm+ 1bpdsx7YNoKVWTny30rKUn2fuw3lunThW8JlasDwV805PXvLZlk0saIg8L92p6MBsoYuScIA hG5fV+5MLz5isWNh1jdOLcZxnGyKTblIRNeVe/i9RWRkXwIyxmn9xiSjOXfzSg0uvZ0beFGT xnorrO88uBXGj9umHHGlOdSSd8SPg1ZzlhpodsvQkvDUTvKMAajDB+DGziZgynD39EGtvZ+b Ai/zqNlgsWhRMYfQcu+RJqghdzvC2+5HBDsx1VQI8S18XekS6Vqm0qY93nwrk9Wm/x27qMZm APF0owz5iVxhI54ckRuvTFuSbnNPOjliZCnEgHjD+AIjlJm0p5uZpQyiKss2dn+BZSq0w7Qs NqgSNCtbJnxsXUPTQSfTYaJsXCra7uxryrKMmSREHtgyKeGXr1T1474FcoVehnw4vOQ4eZFd KZ72ALfHZINibOIkZ02F9WeD1kBGan7VCL5itrm8b9poxMeHSVMdEKi65/HWQRXq1zUnfeb7 BoNHHOp4Ovu7sXmvv6yDifhOjp5PmeBn5KSsY57M1E6E2lNXEfkQ4anL6okKaMgmm2uEzbzV YfPsrKjuHMp0NxanOh8ZthKf305ywzLqRnRLKojpLKkOYNpckRYDun0w2ZNEz24EHLHHD5ra y+ydI0ZCQibk7BzTUR7mNzZotaUtepQdx7fLuNoeSN85McaQyNBZS9yOq7KUS54p4zr84Akz DcUDDOE/ppsODPWJCH4W1EmimuKGOrXmqhGqBadUlszSlJRzvafdvl69DxBxz3PTxIWkkxQP 599LkMDOqe937TQTcUq4ZaC+Gyu58/o/e7x+CqTV2E62cQeROT81KRXAgTRG9OZEHnEzYt+v PQimOqbPM0V+Q23wz1pXwLFWt31kTRW96OzqsrMvHneA7CNOHkujemnrMGZj7etz8+Sj5cHm msEw/ZJwkklvbVE8ZJ+5iu0uUL2FYUEs7aFcx6eKA5yWfEyLPbJzfYnZ93VJq9McGVNQu1u1 YrEJ9uqnW7MXTurOfhE4wWDdxc/Aftk6EusxxwUdeXQMFm2bwVgIYKHbwqbYGJ38ils726nb 34NNuVT4oBfbwQNYeERJfQLdHdZON7jtw/F4dGyCNwyBM47CaBoYy3NhJlCmsplm+SgwVsfN oVmSx1k342PLmDdhk0mcmI6I4kyeycNiqXpuiSh89EQ3ThkIOR3o6xfC2xscy6ZKty6+ZS2b yz8cLPjPmYMiDLXny7SQZg7CMovfJGH5NHGXQMEWDA+yjZU+/ByQ29ZKWhti8s5Si1V3CBfZ G61onnRIzGBpuqruCyP4orRnjKtOY6MHJi8ZyV7ucG+2sbx58cehyFz7T07eWtSrfOjjLZow IEu3jtd5Tk5bGLox1zG8+DmIJbHMmydda5WKnGiKIIvzM9Oda1ZefEWK2YKOxszy9xp/LUTP eBYPSCwlpODIz5b4OQw3wybLc4NOapNWDzoxdP8ec1cJ01yvz8a8YK9rhzTZMeuyqKkG7Exg UrO7MXwHMm0l1+TPODQOM7RHbywslWZbZWTfRVozyzWPfWDrO+S/Y9qIaH2wuhJ6695i4TVf yYlTVRouv9h3emX7/uKF2RLD0fDrac29xppEiTzP2zHXLphIoHu7SXUNCtM+mTh9juac+nWd k237LOTA9O/vItMgyLJSIR+mbEtGF2uuRw7B9RaYVZ7rojjkv1YhPHJ+HHPtPG9U4V59uJnu 4frc1/Txs3psjLLyRrI848N9ZYdgo8zQOuLHYdE40RI5tx4RhE/NBnk1qZ4OlJD5QazlrZjr 
MyqmiKx9eNdxJuyHmMI32OD3AwTfR6TrR8vPJItKsfRBKVV15AHAhvT8f7PIN9LCKqNQ+bDr h5teWf0Vz/OXtG6n2wehRh0XNAcaoccATO4NJAtMIDWvr6CrkznpYDlN9b105Mp/KXFdYY5Y iHcsEtLZBf6cui4pIeEIMPmjpjKuY7SI5sa8a5Hjcc4VwvbqoRvoxHw7Iznbk+zDZK+4zE58 qjaIuPhm2y3dwdvUc8VBDi9a1BVSuRfhFxWK8m+1lx2Tz7ndG6NfcaSszq5an/VMvD/eVliT 4nRyabZuptANrkMHizQQ4tLAFt6a+LrWHyZsiIMCK+HieCkXFVe+vgIy8xYiraJjoNR0CSIf BCvgPNAWDrHhYbbBLlpUJkz8DG6qFnl+ORSv8NB8cAsm6Eb0dBPq6L5L9q5CuihYusi8IPmz jr9zymlGZSxxpayFIE0M4L1KLKL5HzyexUdHOillbIFEEe2ev5Pk/6Bkj2NoKMIaTTD+Qwps viT+rK30HJhjGx4eyvohjQ+IQj8viozZNIVPQfts7fDdmHB+q+z33uqP0PYv34vO0OZDIjco kPhFDzHvvsqS0hMgcG5uSI6Yii1DmcqY8xFX/R1y6tA58wechC+w+kgxfMfgsViCa7BN8Ggx oJm2DZEn6obm3RJ5H1lll8zI5G31bTW7fHUEGUIVubL20WJXTvn+uLss+YjzsaTPgYpnmDRw fFgHc9BTwGDg7TsP3vd9T3w4E5AUJ+/JhFU0NZFh0pTjoAVOFGsCQ8/CFXlf6WcYIsgFTkCV XZDkxD/VCZUjS0d0OEh55fDWAdl0tS0dvZOAcEbmkHcAZARINFA1XBZNC00Fbo3wxw5ZbJZR CTiBUgicanApQzM00Mw5OcCX4YeSgfy+c/VM5P1/THbH45lVuz75y81ISvLp1wBeZ2sEE1rb 9mQqdUGp+RmmKn4fQyneb3F59utQbIh8cYvPZUvi78j3bEjeIyrGt778/zCO7mHBCGDMX7dO AmRRQmd5iHrdmZpi/EGyBwki3/BQfH+jxh+XezAGpBw9HnGY+MZNFg5/QG8EDuHH+Y/EjQAU sSh7DxPA/kRqVGaohDJj+JvHTcmbxcfF0ca/oVbXclzfbLy8/wripnj7Hp9nzZqYTDPnkWR1 nQbP5RGPog8+bw7P3TIz3cZ7W/rPjdt/0k25q9Hq9l37fq4Uq975xv6pgPSmwF5pUSRn+YMo Zf2ffIh69jzZhq9wKp0IXPCLu9VLOpfQQLLI0YGLbSKt+vorwqZmKV3waF10DN768iu+6gWJ oNr+V9Fk5lK4cIsGhNFGARzIR58LEkbDtB+RFLr/C7wi/nIUzmG8vs7fJiZOKH68Oab3MXJj Bk6HQgZIBDsneqBIpKIjZaQQAGCwKg0v4S0yx/N9eSWhKThaKBNYoW8Hf8wca4k3ATA0lC3J 0PN2YfRnlVZ7N79cB96P2wdOMdzT/AIon/0lWlACBCVR8I8/tIwP9tA/Bwn7Px1WBZRWGZWN qqqjQKgwYK2xaMVYNoNKLIsRqW2UYqitsla0RgiICypRlKqrLBWoBRKyypZJIqVtpLbAttq0 CiQtoLI0bG0UC2jErJWQtsBZKypUitLCp+/AEn4UA/Utr+RFzFPxan2nmi80xFIkJ/CrMtJU hCEIgSRRBMQr4SPvwlAcoSAnqwVcZaFmRYE3fBf70gcQNxhmBRGAYgif9MF4cJ57/T9kLLHZ oUeL7xknlPdhxJHjYf6p9EiLEH8BRiDiPkUQ6gY2YBvBmD/J5P/LD+a8o5f2pkSQTePr0/mf 0oOqKkVhHnRbwcMvWFnyew6T5fbmfnlflfnIpmJFSHX9HhKDpNzeI47irjrtQOYeLslkeoKY G/RHYhv8gT97nxQHPFUNr2RSSSSSgHgE2gxBOUZLhBQEJS84Z2RRgHvYYsqZMeqbt/vcrGh5 
jZmO3W27XJQSkwOTjUmikuHZYQOYLkYjEVoe842QAr6n+CmdrOmSY4O5UicByhIlOpj+foY6 f0/KPnE32/b2+JuxDCq/uOMPQW58S8IdMh+GD2jSsS0DBo4/zn3YT3Spyh60V/HRz6SiY9bZ 26OXeMMPd9Fw2oCfe9GGMj7fwE/2Gu/nyUIcb+dul+uvyVFdbiiOVRiLO1hYZH96flM6YW3u hiRnBIMa7iDNsRJfcOzci4b/mTw8V2aFY3yPfgeirxeWMNyHR22RhylYaVDa+9/OZszta1td c8jFMOiTIL0wQnSHEIA7EUM68gggPxgMSnX6oVGStYo0pRTAEA86tU+Z3EDu4CQ3/PF2IIA/ K7XlqLkiLwSh+uOjz+nsLQ+P5PnPE2T/4PPrRQNCLQgQfesiPiGc9hwe+uw3pM2WomiPOZtL 6rCA8juOdCW0rNzsheM57dkZEHdAIEQckBCMCAG19j37mzsRfWT5D95BFw2ECDQmEAiBABJA 5BgWGlO4jCCT49fQaeBTINBmLoFzCIRXXsjEQgQEQTpgcETTrVFNRF+9Zxx3aU+ufaJEXITF jG8Kg/COpDaURsfhCm53xK+dd0uPAIcjwCMEkg7h3kmHTSTwfoRgvcmkiPTODRSt8UBmmTL/ JzyaiYNZShSJJBJAuTvE2JmTcUXHa1RkyDIkc8NXicCoU+1DiGd8UzQVVElwhDsiaXQYPGG+ iM0A93/PJgtxRApaoDyFnZ2HAnPlnwg56USEmZEL4D8itTR+/neDc5tbxLtbq7Il3pXnXLCw bI33nPYlrJsZ03TPys5Ocq0Ny3s5D/idvQRulOArhxLrNfbX4u6ktBQhCUnIlFPdDxUfc1M+ mck8v7FpEcw7v+fFU6ulEzO3wgjb8LPus5pjubNNeWMkWZ1zz0QSn3TupOjx55xvrn6/ck81 0cZpGa+Manqs54jUajFWDw0wrbE1GTy6Tngx/mKnsqh6yFZAomb/xCCDTdnxNuuLZLv73d3z hUM/wTOtOQ2yAF4g+AmPKLixvp21/Y8tZl+6AdyZL6fZ0Q5v7X+lJTVieuMCF/7V/CcpfZx9 ceWT7uTWURP/GJ65z9pgpJ++LdUT80LWe0V1+BYljx7XNkv3cYfV42eTNXZ/c/9YaZQRUirr fWhCZ+//VZHowflR4c5/NGljLNYTpMKNoqCKIoMIboxg7rfEiLNwbkOd8CHr4SnKueZIidFZ JL29BD4v2c+6/s1no5uf/NHb4tzUTcRVLmObn5Q56ucovF/XZycq1X0e9/Pw6q+7z2+cFZ1Z dfy08i+NseObyr4rfWj+woeGNWRoP7seU8vpH1I5u5MD3/6DSem8N717q4+EMvXrehmk76aD ka8Gsodb/5npn15tBXeejonnfpKdUxE3WafN7uXp3+FW8y3SsOaXqr6eBxlddDUS88dvRfsM DhM2sMFzBdouL1mMmK0HpTaEfDB9cfNC5J+qRKaQKDtD9J6vOO/3/9PHrnGOvI2D/u+bV2eJ jY2nznJv0uPb+V3iM4Z4znKcyXn2l18svEG3TJr9en3Ad1e+swOhmBta6ueB4TPYZvP7/j+E 8P9P7XEqbDkYuv7N3x9y16PPiMXvYX8ROdp7JNEqfJVFcj34B6iB4tnqgdcSXt0vrMTkKrxf Py514K9Nyu+bQtH0PCL9CebxqXt9rzllq5e7jI6JVWkaldmZuEXPFLoVvy/CdHtwQxQzbuTv hJm8he9IanXrfcoYPoR7NUiB3KyY+ZZIj6uuejVbqqbZjXl5ZR1Zh4NxO3NAeWYgT4Xi7p4z z2V1zgNJ4mZG2NtMJzZL3a6uql8KQwk86mCfz3Nlqeyo77BKtpS21+U976z2pmYxfxIiLhjh dz35j0H/Lof6cxZh8/0TujlWVPlHNXnfPVy1swckS1GhyzRNOw83Z44ebRbKfdR1pxlsOfpc 
zwc+WX0c5M5uOPtx4HRZHmyiHtqM5So+bb4+SXR62KaENvMzGfqth06MfW8Z+V/Docy4q2Hy 5PT35pwPP5OWPx9evjl226zD2+XrNLbJZ9tu7/bw8kOvb8h3t1nffJfFsq8tHXj6Sw8NOe2z Vw8eu2nDt8zo5rrKiVJW3cI6fLSyqt+7rvxuPXidzapww8MrL5UEUnZfjnnFjxR5OTs8Plgo PZd7OvwnMVkPfajk0U9cdPd7rotn6IzjbusCR5fWbzry8Zg82J6vtSOW3CWsaga0dRK1XROX wi9mbXVoyZXAw4znN72l80qX5XvH5ePx/8tngqSz6LdXzEaHgX/eVD75ZM3RL1P9m+8w+dNC uD97QfopGb+TNXUSw4+Tlk3F7erSjjj0aKtHtu0y05/b5/Mh1A1ZtF6IsJEI11xh2J+fzUjB tkqq8NxV5LtELDPG6Vj8LRX2w8quyqxvjKNULIWmf4r8PNz5QDPKy48i12a8PIUz879F/irL 4WxjhfpzQhZsq2VaebSLbZVgJEZUCa98vZ6Y/IRvzuPG6JAiVa8+GfTvVZDku6s9j5sNh7ev X7s1Srqjz6eCGIowTBwTT8nYZyCT9H3OlB8c/FoHqaYpzmyxCpL5WW1VVoqrVnwjduv2Ht2Q r383XGF0bJ/Xx5oRXXhzwbmXNGdlnk8W+7DmzbYVwv6436NWNz56k/spbbZChzVfLEoLOpXl 0PCuzNjZfnr6I2demt7KX0IPdbHfnCPS91ebpqn2olC3H/PNqs/XZQ1588LRaoRrWezA1adc UlPPA32+azZbOFyrSswUkOfNCy/bNpokdazZ0/k8kSzxC+nWeP19GbqzVYYBe55oeNRlcslH dDriVZT0W3GvTG565xOuBrUSiA80EbLWrF85ItMcLMv7Hxvula2dqqYsNhCovoBnHwjbXrnm tGYvsdh0xxPf8W1p65Nf3wImDudHXONPd0aOmkH5BG3f5Hrb2ce+JpqHXYXZafJPK3VXGemq Gp8Xn4LLbM58/Lr5+LdiKDdnP0aeju6haDNptnEBXX12F5i7X1swOES8qiNZsu1waHVKG3mh Ts+eEvhn6OiWXU/F8YrLyQ7aP99nLo1ZPoMl5FZ7e+zukkIvT7xd12oH7PzUx/LJJQ/xv83G 1UJQZJWMEWStSLKDCfzRAMmELS2swymUoiB3+0jF/t1mi3O91VVSqrBVWgH+Mh/iyf6M8fn/ R/VIZGeKHZ9NK/fLRqXxzEjvyLt1gRCwaIMGJin+UJPLtzbCUlEQUpENziGBQzQ0EQjySZmI YkAVTrx3oPLgxZho7SzqZkMslIxzzrTMRQmGDhSYYmARFBMnEjkFLQ0NDd2AeM6YpbMySICI OCDCpYiIVqgIIppiUOda1EFCk9NhcbAiCsBWIogMFBSJ5pLGcGFvyFDofHL480Og5wt3hiIV hRSle7cUnXXOHP6fDQ0My92BxOzlcJTWocs3Bdw5h/qu5Qk9hIZYiI9FCJX8ZjYp6657cKKD FUic8OXoRUbTiZDHnLse9PI2o1OoNiCxIQheXpSmMmsrEwoJ7WaB3inATPwTbo7XnE7TZ3M8 RqJGjUolefGYmg56Ov0SnulD2xRyoSjKt3FiB6exFDQEihHnhdyKfPA/2Shwx70Hmh8cwkkg szK8SIij1yCeqHvgOkvhB7pev/Fip2wh3SEzhMxTS0h6oGgqmoqEgkPXKJ9Uo9IP78vWBehB SFMW7GoCZfGdYD06FJ/sQ1ibzlBNog7kDmeX8eiKH+XlQDhD9mAJoREuKobIvWQP4E7KZnaW YnriAaxNoJnWUDkNiUIEIHvFREKRWkARCQhBGgD24Ax9ZipSBDRImICQroBExEkqEgSuHfSA lH3Yc3zUpcMxBOfydv7P1btmIfV1wjxE8x+DuETK9+EMCBCkRaVFfIAhQwPCFPvSiGxJR96D 
CGAY+cqG/2tJxvrV7uxqQ2cEejv8IakWfFvP5wHzr5f0eq/Gzf7Arwf/mjP2WOQ1W2fsTavb pJTwj/YGQfj8F+f51Fl1iBVKz+KRBiPFH9mGMHf54ad2vgzMfhjKvucYLF9d8iBypIlwOB4c sPqIwhpIn5fx5Tg8J1Vfv69d/9FfVnzx5fuluYOBLPqkot+KJWvQr5bpy/CqWuF0IW8Y98ua f7qo9saLUJa1JoNBR/FT2zUJi634whh5NYwwxtNmjUSFB35+WPKe6VVCtnlGKeLHBMAS5fS8 TT6x34bTht4QJa9qhJ0uWPGMZPGMd0IJcOWU4O9lQPx4clvL+6rDPyaoECGpuTvnxk0JTUqU oBCMYM3PL+89PM3L4TZuNvcoijBdIVHXryJv8L2zetSoyQY7afVbVwaIHTRoe6ODxw4iPN0D mnYHnCsMdFu595rjYWjeWOZQxEzdXXIlzn8e38z0Fp3+F3or9WFUyc7qYK2MMJOzY1JoXWOo xnfGMNbp+/besU+1jb71EU8xGMRCOs32hmtWmblB2W95nLnrnfDium86zTlm+uOSD0TuCKqH qiD+T5R9K9YApPPcYrpXQwDLSFKhUKiipUiUopz84u/JNes/Xy7J2d2uN1nu+bzdo8B3sM9m zFmD1pgOZDH7v47be+01dp/AYrY3cnCHT66rU7y48/T5E9l4y+HjRmAamVaX6v3K7SQXeih8 UPEQC8eG6V49Hf4es5ef2+9ly+WqN6qhvAW1oZJmSbpEbarPY/CuMwPJ6MiAJDAcU09uCgAN Ct2LpYEOFfFGj59827ht6ZuKGa5MM3Dm8V+eMl4QeS+Mzra4+v6+N8Z+l9Rvm9vzxxfGedcc bblFTPzHhmYrQYoLuY2IOyoKxjkKDMb64SaXOT/trv5bD2adxyFeVLtR9GQqEi33X4EW7xQq BW7HPcQJ+yfiP0lcR6TSu73Mlm7I5JefmK55kaEoE6MOd0PR6R5pVYtWQxj6xvNhUcgd57jF jRdTIthX5cM5rWeE3efd4wAnXAoctVQba9eKey4yNDNJv5MwXUu6CFcS4pCVsDYDhyaTagV4 QzrLdbmnf6lsLh6sKsm0yKCNyLWgGhx/d8gRIIMCXzeqq1iYOzXUKMa2Icj6MuMjnLDSHJ1d Mi8s8lx655rOXB8IuYG3OSqjmr0YsEIORrk9tVtm0RLT6+nMHYmZJguNmzE5zNHNnksnw6ah qsuaWiuEa5bpx5iFzN7hE2XNM3Gaw1FozmGZjnGrZu5sk2e8vk2fGzWG7XJsdaT0NAqvEPE2 Z8COgckPLHYQZ2nW9C7wCRM8BNHko8jPrOWP8zVq1hVuKm6/iMGz48mxjglFxtTXZRIs0IFs EXbygqbVSu4Oa3WdM5UdnznBMQndJfHCZPKhFPAo3Fgb6g+hk/kJno+KMGPOm969nnlyVsEK eP3UnH4ruhXWod5QRfQmnnCqESBOE3lLuh6M3GBPqLjEK1qoyTnJeIuDA8ajCmIzodyMRqXn U5znDTVaMxFS+Iq8TrEPnI+lpYWXLzdrGZy49Zp6ysTE5d7RiauawU9RSxm7vOljWsNh9XML WNLL06JjEPkqKsmpnFvh3ervExm4fWIdVeFOqzRqFWKxWazicaozVLETN0qh4nFUYsg1dJVO JenqND0Oqh7yo0TnDq8aLl8qr1esXBjBFLVazcjlRUTjNROsU7TamMaUTTrQwJKCM5jGc4p8 Gh8zp8aUDw+R9KotZkjN5rLijFCm5zb6fWpvLlYJvFYUazhTnWczd3qIfMazjUOkairw73Q9 YgzDqFN5zjVWa0tRmU6jNJYsfMvgesvrWMxnNXFU86vWVUaKerMyrnNzNw9aWMxiS5uR4lVe KjOqNFzelJC1rS0+Z0Zi8zL2sAGM4gzKuIzUO+M6t6w+ZrORPWYxcVFRBN1WcysPUWi8voUx 
eKzmNaUEFTm7nNw9YxlZWInNKqoq4u7zKxrsmecywXpVYdSuLyKbeHyU/mPvgCKeOA+k+SwR O2Q1AUnEighsJVCE/XIJ/i/nwKKT6YxZCpKkGIAZHQQSL/UilF0KURO38W4I+/IKFZInERIJ VDBBIFDFUO3Ps1n6sN60E6eXeaAB9/2ihZaMihpAekFkLgBXwUKh+7uKDuymW28KyKJGCPfA f0wRNEANIfFOEA7zKAfKT1hAh2gEgBQRGAyIQv4nxelXsoDoFwdx+o/KAUDjCGMlIZPzkQ/t KpC3YzECjP6X8vFSI8AyOkNWBCe8wnJ8MGhZoicKCgdh1w8mKVamLTgfF0U+6Uik6ggI4B0a TJTA2rkaDeUNjOCaCacoxGOR0k48Q32iB0d1XiBHgbLRDjSbBXkl2vKL7KtapCgU9afMTGPH o+MUZISENNoDoZmy3JNSEIbe+DbFHQNpsROIykkhIhw0gSIbJ0wO5yhuGTUWOWZmBgYbR1Cr FynAPWZaCbNDJAMMAtLrgrmG5cjLIdaziWQohGSR1aQ0QMHjN6cgIhQcFh6uvttrIrIQSZUS h9exMSBMQG+B0/aNAJWhiWwZm82xL4/N6Eo+g7zZS47ShI69JVFN2lhLUShWNmjjfxzdJD42 NNbNVU4m6qgeEfAdmzZloIOjgWVs0ECN9g/PE66o3rpBjJqDVB2Sjz65RZhrhx+RbIE+Tjxf jlGPEursZmOB28/I9Pp+Png88w+PN6pRBmDU2AX9FYqHikYX5Vle61rjNU74VLtl0fBPwsR+ p3ja3vHSfKen9OZLx+OJqnVfEGCy3Hp0dOFtc+NSulvIkFLdSBNbHcTTvH1xED7jiGAj09d4 0/lnGri/P2FcNesRP7PVaxCdZm4x+N/3cc+Y69vr2nHh/v/tVzfi+6rvRfUTDvPmoVKouM2/ EorWKUEQaVI9Uz2sez1r73HnFb0aDiN4CZpa5BELhzKvCBZyUl4XDz4BBoaaiBl+skVrB9LQ +mSZMeShheTrH6+JPue7CBdUYv9vd0tuLtmMc28uex5wVTfdpgB64LPd/eoM2U+NH06I+F5x zZ4mT0ObYFCRGCQVKowKWwuC0suLD6uzqn9d4UCpVXxije2Ge4fXh+q2qNtuWU9FMlAIGlTy /m3Brgbb3s7B+PMX7Mc944fxljBlGAVa/d0CgQhAIBigJYlkVBIh+UoX8ZJIymoUe6FWCDYt LL24TUJ4jAmxQGMkOqUBDKCdGpMNAaEClrGLVESwQx06jhobqKhbEdQIQJMa/t+zT2mtOkty NiioJBHUtDeL5GYOJ5GbwZV8MxUe3hcF6oS7sd5Jfjk18GpmTQMxs0fAzyeDwadDPWuG6kG0 tpUS3DtQakIl07p3vTzMzAQSDSTYWEsNQaM6bQSUDaNaDQSw1Bo1pvh3HDLAaw4OhITlCRKl 3yJy1QzOYmoap8PiOz10E0ByHZmcMBwfdgoDoDqunVH6EO9GUSIkksKI222HQnBG9m2pKU3f Idzs75wZyy5li2stTc1dk+X6nh/HGPss5wqF2305andbxlM9cmO06srSrQ1Z9l7i+A36zYmX uAwY5SsLhh4sDsax2YixfobcMXZ4mjlKgLdu3kbAkByHXt7Obl7O7teUO6U6ovVXUVwnXU9d dVJxrVVt6NV7MYNoY2L4Mq5TpYBIyMITROh291Js6Of3N/i+pUe+B+qUQP5f0Bh+XAwiABe6 x0TgRAB/uQES7ohESRB2PUrJt/w7+fDB/Cfeq0gh/6B/HDMOm/eTA8XRmSlh1/G7Ij9/L5ot hElw0CtRYlSSlx3qISw5ShMlNRcTSuijonnyUBZhIHnx3g4xpYU2weeruKqhooohJpsGsc/u GHtwUxD3Fp6EwUKhNJ+R2NKBCJcu4iRbknM0QFlDkjFV28nb0bsprX2cYG81uid7rXni3W1L 
FjFeO8PNi0Zu92nfjud0WBw8cHUTD27VRUBGKxFJWquUgwQ7jf2VMMZToxOFTRQdLCXiyf2T udTENoxDimBfN6eMqpOQ8J7WstklgKwJXEvlrF0/wp1d/v912fB75OqkE2H4FAyirD7pbVbf shHneG4DTGr5zf4TnwP0P7qfj++RERCSZHSfT6YRSZET5m+n2N8rfF9nnCR7T5eeq/1Ox7Br G0cc9fHyIqqqpZaqCCpmpmqqqqqqqkZGpmpmIiKKqqhhqqkkqqkkqoIKqqqqqqqoWFqZqqkk qZiKqCcQ9iL1XsBUMQcU/c0qiWiIGga/TuMYoHXaSt8MJPqibz77vFRTf1qn44bBCxOjiVpf LBQ3mpg5siGPnJaYOSz6fuGxXlNSjI4ZHAUOgT3fQeW0fNXo2SfqTADFwMM/3/7j/d/BxjP8 xVlzB+G7w+kJOZLUI/nqJUkM3C/rQ4Snlwb1865nKbHLzWaCGQO46lv3mhf2fxewUAbCbavY Y0SO+oIiK879nPnrjCY/rfY4uKOUJJJDvzw5CSIRYyrSNpiDu+3uvHrlhh6Y2xRZgfO5QtsN rAXhxZiImud61r+RMfyI3lMC1shkkmP5/Iw0DCZiBAwaEM2PVevX8rlH91EFBPlMTryLKaCL MxKOE2Jxnn33phn+Oee+PGvwvtLSunPfCmfbX0k8xuA7CesXbgKbkTiUyVOsZDDuP4+M+tV3 UvuJU5Ubik0IJ9zlryegKQPhR7XPbU71jWffd3d3fnvWsXuikkIx84+zG9M750RVNyYPCHWc Uoe5Cgo4eDouTUbOC1OA8UycAqOuqqhw0vYkx3QzGCYG/W5Izl9DvzEvGt4WFzN2qeXjOVjA 6eDCp7nGHHT0+BXh4t2dcvJq4qP9gPHVnFcEwwnG1GmcWyENJ4IlPw9bVyGAFo5UIXNSWwoS GYABdcfqTKCYd5PSpFMqMJO5Ed4OeDsyle35XixFIdXwY6S83OVP7majeWGRJMjREc45FEMD IS1Kx0ZKCoK4sr83tSGEPN60JROnZO2YyatqqqKJBAGCRgjCSKSFZVElVYB3IAmuO7nXZinf h4l5h9XIaeSBCHKLfEgw3aHQrvM8TlhCE+v9pg82DH2FOmTEkDhRjRKPGwKZItsGMykwMbcx VMLiaVYWsSIigB0P1xN5peQuDIshFFNREGSHKUNUbaJakJIIQnEminQz/BGMRxs4BQNkKG95 gway85ET5dUQUREYmGojwQoILlR2MqqsVViLBEBiBDBTTRRMQSiBCkJAgRMAkQABSiJS0QMC EQAhIgkKoxKMkKBMooUEEUgNMqyyEkEskhQQMCTJSITITEhJFUyoswI0MSNVVUVUJCMKyMDI xLTEUiFQpAAUjCNMEQqMVUhEhSqRBMoSwsMKQNEQURECJYgcEcy5Ms6CiJ4lno8REJ/4oOYA kWxHUL9R5yiif67Jogipsx675Iwaq35N5140tgSvkG9EgfPeuduQ2qe15NzV3itR9t+/IEhy Xnl/XDCcwUQ+QQhImSbLrR68ad3s9+NH1yaQcAk2IPhMtGTL24LqEqdKPLNGeBSgJXQJQhrA KzSm+3ZauvZd1fJjDHQzBywMPCRXBEMkcS7CljiDvAODWfmeIcd1MEAsXwupSplhmcfXjaid Wc+PydTudV+l9GOIFTJPDJVYdZurxvBlN4AMDw92pi9PDjgzAOikMwXEFudCyXah9R4lxdS+ rR6MzArhXQn5h9CKuYuY0uTrRxCnQtNpglRpsAMwD2g8YahFKqWCHJl6dIdmdHwWNuXc/uZb LMAxzhIFZkIWmxVBIflSoPi187Hnvh1t1ubXnxXo7VkERGCk8WqAInj8RgTrwj8M9YeUmGXb j6jNGMWK9zprZo/T0HjyqOTnv8N7NwJBCDEHt/3MQaSYF4LZckiIdSAFfxkKCbgIkTUKkSoI 
HwQoi6qUKooKjvpSsue2wObngech6IHpCurGAplr4w7qBNEv7Hcdz83yxX7+nwyzlPr9KoFl H8uZOZ1jFAbWEHCYu3axFCR+m7pMkkI/pKkxzit/3aY7Mt/33SPt5duqc0mYx5PSYNq0zjtw mmWyd+0HNPyq8v8G48+RZjA6Zm4TeUOeSK32IPHA+cACEEXWIqyBed1fMW1v+LjewCH9G3Te Y6wNIim8UKnGPLvzphSSISb5sz3YfHfN4ygfbNSUAwSBEorCsIRAuR34W2EhBePPGMGsA2j/ vQQ9D1+z53wiO8OvrnGcVLwIzUWurIdzm+3xSO54SPUqkQZcfY+NbXJ0DuYbtJlkqut3Lu60 jZbANCb/qqD/tQ02aUdDKU++A90u6Z0aY9yj6Rxv2Ed9mHfR6vtVQ5S1h7NB3Q2T2wflNk/G fQ/bXYhtgOs/F9df7UxOimnkBNZhIgkaBNOzNIzbptMEBYvWgO5G2w+b5nKdhIAvp+F7+kvq vtRpGtYch3eEJLXZGOvjF55nJeBgzW85dtzGXuXWr+IwTsnGViuN4+D9HT82PRm0+L3/qnn+ g9B/DDpPbUtpbS2ltLaW22ltLaW0tpbS2ltK1LbUbS2ltLaW0tpba1LaW0tpbS2ltLaW0ttS tbUKCDPxWWMVbLWCqqqWlES0p1/huNOWWWhQYiIioiIsENQoWwsVBBBETCS3jqyZY8H3L+/N i6mk0gKFf3mPiBUmcNT9Q76KHBTwlx8XX9ec5mAoaQIUdMGDs9/O5vsfMv8vA8Pw+PiGl5vq BuwR0Pw+i047a6Wc6Eo2Ih5ROsDYUP60BUNzpqFFZ+x3ZCYhOCRyUYRFogRG7o+74Tpr4O1c WU8YUOxR7BCJGVZEYkJRRKGCIoUgoLQdXQCgbhQ28LfJz/AEN/F4neR14QNMB3IoxIqnRVDC UiEFI7n3u6zocjCu57vfMTvpoQ+za+OWj6se+8JwiJdmZm6w8CnEtT2RLJxsOrOvHWWZus1z rbPH1XWevhZQ+Qw7vAzeSBRBCnB8pfjuJngBW1AWYdO8W5lAyz2zyHvkKsV2GtcpAyJsNwTG 2uWnSKgOhdCgw/boeUfU8wO3q8MzA8euR6xbtBn0Gnb2mZ3YLGGUbdREZlVbRnV+jDq64l2r DHG0YyvjkR52vLAJqYGZrcj+053h14d6+62vjVEkwRS1B7Llk7NqWRLGp0aQiWSzPDQZIa4Z rnvbURYkxZjvh3zvt12bPZ4fSeEcJ1+0uIf0xzJmhxMhMkiGK4G2b5Ac+vIC6pdTEijtOXfv PZndVe41zvg6ob/LjicNDibpxxT2j4qUvCend3dF+JqeHcmkya3BF+BLg7VWg76QgDTY6myL M99tmu+quSej4fH7usnyu15b1EQCoVILG75dl6rVr9NvzXjk+Fr53PtPdy/0/kOG2fmAeEEh DnIMxNkio3IdoSxzwu14PiniRnmI6ppbIOhUrx4bdKdDUlTF8rqOw7En8QgPc6u2dvB9PUUl +bWul6aPLrjt3UVB7hTgfhcK4iqZmBVt3ZItOlpd2kSainfFGJiOEmvtMWpiobJZR1Y4pJFH dISEmSId5ccvfc3y3QgiaR1rrn6N8+3ywuz1y9yJLDu7HyeIm4eXSQjt2iUSyhmA8tlHuEy5 qcpIHqi1EyhUSS+eu2H88R/S/XsfL2UIqbQaSoByFdaUP4OEPnOUfWb/wa7eQ6tp8/c2/zxR 4fVD1aAukg+d5iPnhJ42Z98FKGQzwo4qnetwhDxFgZbaqHxmWGGO5y4X4LVCvFCdyF4X4evl N+Otb48Qby7OTP6GPnTKe+hWTw3dzME3YctFtgyGDFB9y2se2l3l4r92+XdbBgSCJGmbCg8K X8AXmuaSJGKd3u3yzJIjLrfm/GMS+FS4rL4jeDNTFKqwQ6y8Y1jS1VXWqznUPqrznGou31VT 
lY1OHubuoqXfTzFsgj6uMiT5pvlcax1P7VmwgiaicMstJ79pgylWI+utgi/r2eJo83L4gtkF wxZhmJRTAhsc8OtQVDVnMG9Be7tPb3Md3k/lL8NzbzrvWkSu+oMkRSJSZmRm0QO3t893dMP7 HJ39NW75qViY924rLtWIuovGDulnEPNhgYeE58deGgRixyWV8ml6nVhU09kefZMyevdHgULw oO6eEOhJ0URFSZJfECgFZiPuGHGWJLCiVpcWbbsXxjFwMoQpJ71KU6eGTxEJEMDMy9D62Knn yMnZNw9HSjeVx5qLmGsf21kqSoeMQths82EEE751vXg4cd3lsD4oZpQVMVC5eUpWu8dA/n0d gEU0N7U+Pn2z4V8r5zp2US4KoTv6v7YYXMEDl/Z/TWFNwEz2WaqmJBCQklLbCOoExEuDSxmx q1zjz261yrtlBwPqdcMcN7gZN8OlfzHPl1LvAIh2fiHODoXyj0wl8+XRQE0KYmNz13xOOJmF 3h3Wx3EGqZmGbKS+BgfedSbTnePgKJQ664OyCk61Dd5utsZC2LiISyNrIOS4fAD3tWnijvTN GcUgmGEWAQUQ7dsE4+Va8Ma37JjSVc30zi36bpieMkRd2YBiyuLTBM7hTwtz9F1Qk1dbjphK FwI3xzAdQkk6f7nwlti05uMxmcOMG+MvH2k39VEymGw8IR5rgFMlR767WlIxNyTL15SW7ws9 EvjtGJMXnbzWatYjWpESaHxi71ZkxKxKWqia1OcXmlRAtW9nYF2g8ddmIzfKAMMeyrjzokxK PDmLecYLYeZ40Exk5p6NNr7Y6p90ZnN+nEUvgARtW5LuKZ+aJEvI6rzDy8JvLgvm+WYtLkhF vdSEcHIWvORGlxYr29jV6Qi3VwkQQQQ3r8rf1TS043FDRiLJzmzZ6qrG+fAYhOsKCrFSzAJh mYM+noGL23t+Cx8eJ5rGfjeblO7whduTn3CL6bnxh543fvl+UPzUO8ju9ZmUvm6BcnFovO3y oTpJJdMMMRPG2kCDddeg88NrZCcSl3MjMwTrxszleUa0OVkOofF3DO6SR27FTXbtpKM4Yt4l 6IeU4uxZwYxnDTl3zCg15DiYrB0NmRSX+ObjLu6JHSN5cciwZgQzMFVsY5tMe9rx673p79q8 s2nmY/J+dBG8x2k5WgTnYgH+HTey/VaTd9PCbKS1Hdbl3MeMNngTnZEZlBDuPNtBPAh7SIet Ummzofe5kIOSuWPU5DRIzFuNcJiTMxiB/bSUB+U86FGc3PKWU98W+JRVZ87yqe8qDWtaeUpe 8VUvLq4M1lGMVc5rTvgjN5xOeeN0e0QIECBHnAj+1VfoSkTpnhMnDmSzO/d4wr6Te8xJVlBM 1ImiJJBxUKBxFM6ekcjtME4h59CNmlX8B/R7FiN2Mm15NDfCW8vnGEOB2PjQ+WxR24x2VNWb xiz4szA9xlM9zErTu11S5dynTa8K7TgQNzFb6elUXvvdtW1D4zW3aUHE84oBj3rmk4BPPDwn iC8VvL62/F14m9BPsewxGYHcMjFwaCGmsYJ8sLJvfR0lQZsDZQdeyvLGpmV447DvGlA7cOh0 JiEdhDrLjpBVOWtQ5KKZFUk7McynZgfAsQnRw/Yiqx1cGu6rgzCoGOTnUNJznbHZah9kmczW 2jNZJK3GvFQHthkqnYqs9NkLmj2IOavmsNY0SuNpdnzxHo0y1+fx89cWmTw5d7hPYRBBAP7Z 6b4s83j3T7UU6c1xxEsl3ccSFrh3h2ZIrLlWHfMeLWAwL8uuUpzcv1uX19IJK92iIushixPQ /nQ1MzaMGD4s/rxPo8yuE2IRq7LiXrjFInRLyVInw90D6U3UaWo0hPOcVnkk74Dn6U7XZKuH nw82aNY9ShczCjETMm5Mcb8XzZuWi9f1Wtc0G4vbMeg65eSOatJC/4OPhHecuggaM+CgVFY+ 
tYqbFdY9a9/rjotc8+XKPXG0pfnuCK5np564nXXHCc1AjaZozeM6l6gQYrAICYgURBdYJLaQ oMB3emIJJEEIeTbPs+72D26wxdvr2tJwOBnTGSohzIpgxSvQIHDi/4FsNk7pCOsSxlEtpwfT eMFqquM8logzjyPGcaIUGBsG+gaA1au+6Nc4IrGLUEQKKDumtuyUwq5h1t5Q5klhrgGKTAAs psgSzfM0MT9LnEZDheDl9nHEji6ynVuR+vBBPT5mHiUF8j5LCB6hvls17m/Jr0GYSEjaBgVo skWSTz8Xt7qoPq+zkByO27kDz5tKo160FwNvToYU0IEwztywfXmggRkCCay3Z0uuj5S4hbAZ gHPkKm709Z17HqfCipJwsPeHsl8Kq8POiSca08GM1iHVGk5VYtZvL1jL5KuspSafJm0fbNNQ MwCV9+xdUfDzhzYvRg13YGUjfxpAereShuhVOvHLCuF1Ouokit3mK1nOgxJ5XvKPZTdlj5Gj EqnZ3flK4zIqNDMwePf04T6Dn1ZO2euY6icJmYJ7jMF2qY2ZO3ZVMo9udy2rEmTq2dYde9yk vW4rWCLqc62+1MYxhCefHbtrmpvabOktrHJUjMwdohzKdzRZDLdgW0h/nphgbHZ0jnrUcPVu Tnj2WnuJu8qmeTRdB+TLLrNfFzipUVJF1gnGLUWkE1I9+Xbd3GQbnIM4Bo7NYompCEAzg3BS ootxLm1wOIla0B15mnX+lkhLwnSt/HWfLC2q9IhKh3td4pbm3zcZ9l5zrV7mLfOERqCo3XrD zWc1t5N4RDsjdY83qT58U41Wmh0Nqe++yu0ksqVtiQe4UtXspArJ2USyVaFse7XHM5rd7p+r hF1B52PfK3xrm325wr1mlIDWmG+Eu0Jp56Uqt61HF62rce+9xi63VRn7ozMExbMwKt25h5uh 65fvnZVozl4dyuz1o3syvUV68n1xvx11N9b5y4qPNbMocZfU5wCYLaYKhvfw/7t69fflPqO2 W0QP4py8A7IsdK9fjPE8w75rA1g1wqokiQ334ehWkzQd8u6bjsiCee6mSKXCopI1DNVijufr hzQ+rG5c0OMJpI3uWhNZJKZQvO+5WMcwyPbkd3LfNU7vfdcPvGafOTUZs1V4xRiH1pRjMXN4 Z3KPOjVUelczgYXzLKvR/0II4aLNBHL/nNd22WZctOyM2imbrNOcz6qSm371aRRb6nMG9aG5 IO1K4FR01zo8emIDMBCGQSRGmXHNVK6gLNPbsqV1qhp6rAsEGh2HCCAgQRJPwUo86LpBIy52 RJtZplIYBIgIgmRj+rA6eY7HuPEx/d5+mLbHfzxMHmo84UvBnRwey51Kw/Jm4tZc44AZgL4z gvWdzzmiKX1rmJfRpU1KPwn+UhiOs8a50BUIHv130radNuPGseJy9VJW7YKhR4yeLycB+oTU 6Djbft0z1lVjtrq6GKaCj5qSSAkQjDAE44vywk+BnTIQzqlpNClM53219u/6PZvUZ1V6SMy1 0JATKpLjds3srbqJhTWdqzNzT4xXEKAUYxnGXnKq51jGbzhXFNGdJTlU0XWZusw9VhXMTiMT RrM1gqcKzsPvoZmDKr7neDzT5YPhz23xyuYfWfWGShlPhYefX8BeI9pBQkm89a4859Lxt9Np 1A8t6u2tmYrh0sPDfg8E43ffO331i88alM7c84k3yuMb5Y2U0hDhrXOvKox1sWiiS03M1fFe VOHD8CjL4Hg8MAZr52K6yZSqTMBmkw1x5m/yZMuH+v6nCTebVTm+/+n4FDoLIHsiROaEPqfv 04tYxH7D7rBNHlx/r919QMFYwqwYMiIm0JCS/z/USZfjbGFctOm3y5BmY2BtO3l6X5YHVfHb 1V2WE6XlTFseFr/qGS/y+jbpl/DU/3cDil+7k0alQ4dS06hzVLVq6Jk0Sbjzliv6brC2s/UU 
G54CncQRaSzkHAQIEN+CGeQ410Pn/z2wCY2u24eVYMEHoI/vEOqMvsHHUQRm8I2hWVByD0k0 XOR42yoP3PTX/iVUi/4f6Ycch7ZueezXvC8mxr8Z42QgYRBEfLhY5TEf2YmVBAQETTJKFTtE SAhUI+yANfnwU5417/aX38ZZEN7xH69+qVF/mjKp4oXI53JO7VIEiPPCB4lROpKLudUO61js lTN7TKwk1aYH5P9kApEeS9Vf+qGCmlF2yi+pC1dlkW/yEwkVVuIXn6+BClzsf3W8fTySaSD+ qanMr4Yruhdnu9GT7P22Tj/stshqgtenyvsNU1329MtmUWpdf4752ckzrlpDRG0Qfd/fjm21 zx+r5J5+CDTU5SvthBLTyQ3oI5qQecJRSPBYJnW32vj227VE03ornHFTUBVQhB4vVpr2RzK1 NYsFE4y42nPO+u1UiN6dOizLTEmzb0W9pyUy0Z0qaUMakkLtfvRBMiu6DZ9Nm+2XrVaJXOxD NjnSq556FF31Jjuo/ul0YvmUt8KVx55a1RU3WUYSY+NN6UQTHbVOqOAuST/55OQXzIMMztkg 1Vw6u6RqF6eHfZVnNFB2OHPLveuU+jMVQbRp4w55uckOa1iJCXpfVxo1JvfC9dmuGPLpXVvp fkJSXGGVcXlVF2ilODyQx7MUdO/qtA0ralFoaa37Kwv2Ij7wRAuJRPXJ2LUwzZhGlaZ1PGMY u7uhP4XQjl5OfPIJJCQqnU3tqlZU8l7kb1NcVipi7XtIMyEOtagm31p2khrU01fU7VIJoavr 54qvy9GbN7DOeS7ALQcMTF8DqljG1Zs8eQskGkJLpg22jstbgct36yGxEkUXprMdSfFPjcSe u5tZZsS9WGe3pid36cy+yeqxWnS91k7NZJWKdOzfLhQPXSEB12SjS7H0aJR080/FT1Z51PUr 8OR9y71vTcFt7NMoQic9W7pjrIuxzobBJ3EgzrRnvZaJcZjr1q/VGfllfvcyz6dkSznteDEj i2k3w9EmKo0UL+1wl5GHb0odQeMHXqEDo7EDeRRqckU5IDw9C5ndtXMc56NlbG5Nbx52bZq2 z2xdt6gmqSWGzdpnMGobqU75n0v7kP46lqrHw19yPsgb3CNekRjv2M1Tu0VAqRAiRgC2OaED f8PO61JJEIWN8Fx91vVkzUSBj4eRfTrKYiYnFalHP19HLkVwdqMaGUjW7DciZmrLUk3Pm38U Rxi5oRUIbF9TsYw24QIsKcjRVVq1d+2yvCy1AN1Yv6pwSvvgzvKI3bjGUmIOoPEiN8QXOzG8 WC+JA0VmXZBx1JQz840NyI66iBRLw9RxL65Qm67k6p23Wwj5NYzs/mc7g/T0IgXZ6QCpGVfb BoaKdkYVub98zVHVO2VXZD1eXd7vhohiX3Wy90VJ29dzzwpo8cu6nm0+/CfQti8ExFUEX3dX RfIeTmfzZoeda1fV0hyWSYbr53Q6pE6Gk3lIHquwz1JUZq1jJ8UwX9XlIIgjBQSQyv3bPQGn Bpgjlr4LTKSaaiOOzA6OU73IorE0oO28TeRl+He7NkebmgeSuXXHVmUOyL53bT0a4Fa9rv1p tO2Pj89k0ovLpxk3Kg06RwlV88C5lySe3l9UNabUIkr8zz6ud4dC+pFSyW9/GcyjC1W0zS4I NaLpM8EYq63vtEiPdql3Yz7vV7oTE0G2OM3L67jR7L/NXVri/DTtPVjPqO/taxw7dfR3IFcj +6fsRIKjIxZAIxQUVQ0MifxQKeuR438hw/reEU/2mQ0/8BgZE0R/bjVYUS1Agf2daEvVn8/r Bp+wzwwLPBSBuqWJCoiH+XnHUkkh/2ZMHmRucXQA7mhV7OMOJOCOTFxTghOYCk62yFMhMJCY TkjkNHGrHJNktCxGicgH9rc3JZ0tt8G0hoAiSMWCMngYQYooyMVJh6l5hhi08ne/rpDpO3Qv 
OMccpokWwxQhQ0pToUawG4XhEFGMVi0k9CYZDDA9JUOkbTRKdn6t6Gj+siJ6FSB0IOkAHQ5D IxDjMLUpFLDLAVPZFNFOUpvMEkA4hpKXJcIE6yn/Wq29WWSeWUGCd962rKLrAsLQGDBma1mY DmIcrI2fWaiLwXI683pX8P3/h8s69PZD+B9tg194fgv2/x/TP4fjafu0Opx08zHwafYDFiL8 oCh6iUR1+slcE+KDA1n5NHmOyDLKn0BGPyHqhnXyyxDgDS0NLRQYb6qU86gtrhEQklPBdsr2 +9h9uj0vg8c8qnvMAnaoCAR7o/zQESp/V/rFARwEEf61ARi8ke3sgV/1gG4WTuLP2Atrp5r/ F8/h/DJn4vSykf2tT3zEvzRNESt5L2/6JlXlCnX835zPIL7PDOaf+VqP4j3eni8+Uhl8P8j1 fV+Y5GBcVKA9olNhZ+ITwqupN5znAh+a2gQl4oA/FEBoKHH7jtDUn3GtGnXdFVVF0HTs6kiB 12uwHCAWhlzocm2H5vyzcjmj+uA7Mocr/OOgtEhD60YHmxyCO4DlLHQdCh5HbAzMqnuDQdPf oYMk5gTlR6DQwMDlUuqgYG8KC5AIrpriw5piOppDxTsBwPE7OihkUsTgDwNWEn7BBvFFAZNZ zSiBQ8Ex3qUaXIMODCjRUKHkhSkBDXsBwCAcAvy22DBGIAYhPJQYJtgywhAHPAeAqQTiQDiB QHJTJ9k000sLOcqFZa+/DsRiv0fioUeEPm9FFxU+Hy0Ds2Ugx/Fe1ANuEBBNWI3GJ+d3Yj97 HwNu81ocQ9SByHjpzNFJBHr8ZYf7PxYDeM0zGRim29ANYwBeJZxVMxEG+5u3zSOi/xHSIFsI 1R0jPQOjzfKj/Fnxdh0V7ZBoAoOpna4f75zU1RLoH75sD4dphAp3xSUUjSNDQfiOqYPJi1LF SAbROkftDhOwHWAAGTtGkkfDSJqBokRU/pneRNO/9NgP3zdR2bYMcdPJn7HIGds8jl6CBDWt W30kGA6PO4QBAhW4MM1TDDoSSN/qp0ynhHmUZ1mWsDk8X9ALstEpzNdpiIt0ZVNq7DxMwe2M E25f8nl7+a9sEhC8Ngx7Bucak2fI9nrIes4KBP1ch7jc4WH3z45CTLVryw48BuRO4UA3vEng un0HkiJyDS3NQOCGFxj4136CYwFHUgHQRA6G8A4qREPrOvWugH2sB0hDqkGzEM8sFMUe1gHA 72FPIPI2cHtp1DKGEpPArtI5HUinUZIUFOywJoUoaDA9c7nXsUEclxYcxSGEvU5QHGYgHM5h wqESQSDyiiMWciuDtRhWAlShoENWO9FFmD7Fd/X4nhTQbKMOchdDRnqYMDuNkDMsMWXAprMC szNNqh78D6X7YIYIv70qH3HxkfsB8P5+H44kkftKk7oA+6kVH9736DYEUQjF+2k8YPg8Hw9d +jy8p8FESlS+hnyqcEmwEQ/rMkaf4uQx80umPq+RbUZalmpy1TBDldJZPlJr9uro7RWuhRsL 12oEqICfm/NJkAOPJYJ4hpypsdu7IfPr1Q0YTPh1DtALAh8y/ps9uxuA6nPKvJJFswE0D7x+ CKqJmqKCPuf+3/gIPlbWv9luEvYzdHIUh/ZPp0Do6OtEDqUCrsuudDwPDx707n6IwMMIgiIi sPsfPVVVVWPHJ5GlTkgPVowBIVDES+lrpTZurYQW/PEpC4jqkPMYVKQTu7dZmcHJ4NORe7YN Bwh9sN5UYA/Fyh94SzmCS/qEkhtCyNbEDHxQNLHNa7nkTNjtQJMCd+43UCkDoAmtLqD3rvCQ kZEOU8lIg9FJLDhgSMREE7BChfk5KH493DQOxqj4mpvodci2BUoHUHgZAwIRASBQ42aVWhLW CmuQ4aGyXod4leufwyv8WicWA+pHFiZhdyj7P4OdJr7AoKMIyAxeXeD+WE12DswRQ2rJBsNB 
ucdAh3Pes2vtg/0ib11MuH+JYQ2EYxJ8h4I+ZJJcbU7227y2XSmhWwXn5oP5tT/N0ckz9J2E PtbbtPI2EfR1pKDgo7wP3e9cREi7Mo4YUqDx253nAPlj5y7Ow3+S2/ZLjbUN9jJmR/oYDoJh AB9mxIwD9DRB+5rBSKIEPxyYkbkyGJyFqYiHJMqomA1ByiX3INUKhQYLrVBM/zX1vqfoFPQ9 +Ohcf2Dr+DVFgLD/HTw4Sdh91Q1JoBDqgwpCSyMhIEQFU0JBVExESkCcaA7fzYD4nQrEppRN 6xTUPSB/j/t/16R7PQZSdaiQmzdoURHefpOIGhqQMgObDMJlyQkf2UlKbiHORodGtk4fehPG BVWKFNZDlTTRSU1SVTVNKx6cckaKCkoKKSkppqgoKappKShqhqhKKRqmhISGSAEkhmwG1sze I7u05uxdna66AT9fi7yWNYwMqMEXO801qwt2e3Z00jomb/AN9P6oOwgQh7+IgcFEDP8IG7n8 BQDXZTkd0SCbs+EgHK9oezDQTQxSWCRyUBgu4xVyIczICSZdwet9QoyKHQXzmjEUhQ/IQksJ MgHlA9ZVzlExmfihHGUoBP4ZXehxP0dwekKWADxFEO+YgCpiSH0fjSJLNgvq6OzCaCHs1A6V S3CGzKGwafup2uvZ5NKfoxMSiIipPPD2S0p19z8v8/RA73ve/ZMO3EORPZzdN17g7DqW+tEQ MonIx2CKdxw/phXOrCEZWpx3HLj9Ce4ICmEyA3ddKC9KO4R4DrA+YeFidGZZHg7kDROtnYKk oViPJhyoQoOHyIUQh3vAseNnh1fU+E9t/fLCiaP7pPqZ+aexXvfle90GBID4vfwUpEsKEPPe FB4MgQcRHJIjaPX3u0YLzGYgboHM2IdoKZj2sZ+ZwFrqcaBi+clUdnGHo4A54M6hpjJ4LjCG TVTlOOYSOBTV6BRagkNR6WMFOHHbN6BnhUJJhGqU4/0tbOLDmavd65+3B/VFfu/XJR7MjR9I QCghSgFdKCCDqTMxr5GMvf+shDtHY7OMaoafpnt+hUYzKYlKnOsc658y4HhjViSOyGlwz9/x OoHqphCKJB8vmwWd0ISrpkUm/aATWB0SNzNSksyjAxDmBOw7kLA2oxOoAxBiXcgMI6UbFgny KIbGoyCxHLlaTZ+K2nYht2StKFKEnXADYJzPKLsxPmZ4YOeLPEyropjv1DVg1gKLOFDwFPp7 jEH7b65NQJv8+ZmGZgGpQPnUBI/t/NQ+i+Q9eUO+YMRhReatuvGe9iIaZUpySiMREdLCPfYN /ybMq6bz1cL24YB8kAZ+5S0UQw9z5fmsxkI6LyIvHcmEMGR22HwrjxLoVeggOgQHO2NlFRoQ 3LZtHFrb0CGEClzZOuEwwuiGCUFrgsbL4th89hT2RA4QN3xJ4nuZyrrQpb46YJfgX4ad6nAw DRgoKEimgZVAt6DFpjszlBuR2pSJg0/eID4wVUsaMCPE35ZRfxIeJ1Tch5lIddM5MSijCWSm yXdh8c/mzgs0LbSyNLMw0WEhnBRfIw4SHqJWYP9MUootDgJA+l+A7cex4A5ePh3BRZY6ihv2 ZuOqGwbXqXrsB4h0wOSQgeegADIICG4I4GJso4Y6k58D88BhPw0UK/OqBwALOkBl9iHDQX2w E63zn+rFPy356qOY5h/gIf+0pP/pgB8m5ccDssZH/0T+9/HVn4f1WAvqE9RB9r8i0JQJRHcA PEdUn7Tl6xVSXFiB1TK2rZ2jxsYZOCZA+RSOlQ9CegsDa5KRPICUao/a4WP+tP0sxQYMI24j hF7ZtcAcARgmZCLPQFdlcI9+Vw/w/kt91V9rA6Oh2gXwfyIaDYcCsC5gxGqJtEa7KdoZzU4z DNdswGokKxCClSFrxIfeDACsAepToCODzoXfqq+aI6nwAi84kIYU2KUNnxcin34HgiHi7qhV 
KGoE1L8UB/V+/VEVdNAjtGJUIkkSr2Yc/DhqXrhiZAFJZlOSmYsinZPwVagdQP1F7wNYgZQP Hun7UKqh7LJZ7b+18/7mD9nRz8VDqnLRILiAbsDW6GnCb4whtB4/r15amLu8Yzp9h4J2BTha JBB5Dsagv5AEObomXk9/oNXDdHj6ct2HLFHmNP0sSR4BfGhzqYEnTOHZQh0+GXrtVW8B5gRT hU7m7KsLSqYJNykSCSEEadsky9tp4E3HZGVJE1ERNMEH3nUh9PzT1mEvQQv8PaocvqHwEPx6 nfy2zsEA+1wFn2p+X6U90EdVTrdhSQu5SQJaVMuJVzuBPN38bBwB2RSxSENO8+uKUGUY0HlN DlwAPZx6mnM2X55Tw2EIUxFFA0+JKOEUnfB3FiUlUVMbHEDRgh0jwTi4Swg9KphsBFwbommC lAPHOQTYfbxD1cgKKE9+hp9YQ0HaS+4mNDPSV4RSgHv7CIv8ZB75PztAqdUOmAbjNNQhVHDf iqlHRAMc+CWJJUJF77Che3WpmX/mwAZxENiABq1SwK7kJCFk72n+47cvI2yVm6vZsoXgdBF5 iBD8P5f1vn2KD6fmpD5SQgsku7F/J9O63+iLrlQfXVA/DITiyHV3ShefsLsJUZvW3j60lX2I /pP0hogg0AfeQhRo7PrLAuyEgUxPgopI6euFu48T9mk4KHENgPvh81YjsP2wX83cdSwywgEI GKpXMVeJyT5sg8+Pz08yw389ddgrD/IBqGyFnpONSISL0y055JTzED8RbSdLoVEgh8e0NPxn JsyVUkIHT3SUSJKpq1cTuGOiWc3UAZ5MMl2HYt2B1HwaEwPFgQgch6TNQM8B18A3SRkUN76W nwKbDaW0QkA4IDJTGDzA8ixvW4DFgUjq+0hVT6WGgK/hIT3iD44T2SH8JI8sHpEogfmfM1Ox 4k6D1gxOsKntzSPlO1rS2SEqEEhuPokw5yfLsLMqADaO81Ov73BF1aDal1jwhN9KSy2xskAB zgKMkhAjCGafMhg2I+jGwDxpxfqh/NCu2h2OTwLC0U1xvFKFTcPMUGaBEDMJkhwZClD3xA22 Q6pw2IO9H8Rdk4ds2z5L/I2WBkYG+rxOgJfpjktDeQDxhlOgFOAjjIC2Ow+wgj4E/0IKfx9A KIck4KWn6nUEXgw4hUTksaGclIF5l28bchyaSUZEpoLA2IaIJ9wPMRaUPNPi+lpNnopkChBJ 2m/yX6HuLsX8t/3lnSeMJA01KISBkWNHfWMhAKMRhT5fgAzD4gMDFMDyCYvY0MIcfewQMCdR UhLIxuAJw9qflgccx1vqz0A9jE9dEIwkp6UEkZ86h+o583oI30OQ6r1K0oJuqlaEPOQ5mAd5 gYLMqd54BgdiTsYQjEFITmE3zm5wEdNdV1FOWAZ5w0DKJwD6nj1DKra7AYBLOZTU4d08BpfO 86mm81UdOY5XXk3McsgTikEC4vNvYcwwLCLnxtea13mqikIuFQ2iGryAbOhgd4O9PR7nbJoB FiRAwaacnUyugUTMSMG29LgNyEH8aD4JgqDNV07tDmmm8WMgXnL1OT2MSNxEgiGdn+Jt4ps2 GbbSAwgxBSzPnbv+YwaHOKCnoahr0+EhXcnCj1vCVIVW0+oDYiD2ijsX4tvsCz4SwkZCPEKB 7XXoE2IHlgIv1/HwsdVWBqnF7HykB/GsKpneB53leh7CQ8pIQISQZCIV/Xs8rThiqGIkrLCU PhKjRBqKBQQ2oHhiGiEDZ2CdZ+91UoQ2R+vI1iTUr2unFKsmDstqGOy+FiNbLjIFdQ3APZ1r YNUB8/5AKYw8kETASJMqJACd4Sg/USr8o+XczAWkOhKvSQhJWCYZRCO7ve96hqGpGgYZoVA0 Ti2giZ2L9DOE3aYUG01zEhBNVBHR+Lw0GD2vfoPBQ+m98kPrmlPz9/ntdkVDm/OUibta3d/Y 
5hCEQ5j9ZOqyPcAVG06apmjS7bEIIsbMlGcleRS2QghSlAnX0RqkVxngt+OSNEJvmCs7GzDt 4OWm1TImeIHhOUc5262QNLDA5cc3MMJqYEz5mblkxDIMpiHWziQg7UFFUe9RNHqnkQMZCFA7 S7odwAhQEEi9h2GA8SH2hDuGO0RcUVMsOoDddF2FFyoanTUBW1OYRTWkLsTOw4BTO4gHRyPQ DQKgZJBC+BIAdFlQ7kJFNUTodqUcYvEbGlOFIAEIZpNYpgiHDa8n6KenRNdOuajikohhaCI0 EG9+5HjTudA2SPcqRcAH3Qz/YKVpUkYQV2djRHcYhxaHgPg0PbMpfOC83AlG3SJWhu77OpQk Mw0RNTRM5ByGrlzUMbjF5MExCZJg7DfjGm4iAEyhZvSUMMPAhQYDCUkBJuTaZCDYUmr/OPRT +iLGMIqnzinwPbTREDs20oGxj70RP4GKPqIebWq+Cgc4OP35TmR/ODs6KQijZkBu0JzZqjFL p1tnk3TpsrKTahtavp2o5mspTNY8NLjsPsoW/jbSJN8nZ+R2LON9uj2GjQaOPf88RjSXu7nT sa96jz5Pb1Y48ojGoZhX3kPDN3JV5V4tRFYIZ3ZGhhsWC8+PEM/W/kM44gmwZLfFCZomWlUI vYTAaHKTapXuXqBq8DhhJap8YZnEN7jIoKeg9QwLyIRTAW5swCVh1gyDCAhACFHPKcCgCIbG xooOUcjCorGoiIJgpEzuOvU4Ovw9PRdT0ejRqRWRhoaDZITKZF8sGJ7pM2c+kH8XGw5Xc2Lz ph/UfppP6isIOxJ7i7QoSBpuLQDwHMEPAO8RDYsI+WLra1gWVDEIB18KUogpAgD8mxSlO4Cl A5hzKwqeA8CQqgwfbU6l3IbHsPFLPyYOXpTHTba4p7Hc+RlDGFnE6J01C8aExQBR2DgGWzTT ZlgdcEbWN85DaQ3nognwkAe6HvQE+AEIQEII+zIoT5TzURiP1kQcfL81Vf70Q+1C/xaVzD7h krAoHZSQP8/Ai1oJje3FAyYb1Z+C29HXyQ7Jz3+aHQ/1I6LJXW88fJOl8XPv7KlpwHJ31LMd um8xVjNHejqF4cPlHhMa6ZhTmtHPxhmLh5EbLDMNIXJVc51NendwphUKkcAaGuC1OLtXJPd0 ToNeWcKK13pwIkE3m1Fl3EviHTspEY7Aofi8gOnk+WMBGjkoYCA7IwmJImaNgeIbkyU77/N7 b31yUJabhuijuPDlD5m8ZyDTYKRcRixgYNooGz56BWitJzenltqPioJIiHNvltzlhqv4Im5S vIIpk7tgPSEgSFVKLpO9Q5AK3YHFHq9uMNqtuwuufTqm50XsZ4seDYUgBoXxPKIxgbO92nmY E42ZjYgQWAZiyPILol805Y5xLBDZCdReLBYaPI5xcjE3ZIwhIyQhZk0XzHWh/MPeZcpDzHce KbFlgaXomCKQDimR3LNg4b67yVxXZjDoJ5gfL7cD5oXID05gIZKZBQgn4iH55VTctvB192O9 4J9AwDrWCfRBVCUiUvZH0LH7qYOMpGRgwer1C8TuShHtOR9ngAXyhg9wzH6tg0PUeJ3ok/S+ yVuZPiRTdOjo/BUQ2+PEBPdzRIRkCkICUoaGQvkRO5EIVzzRCyR/iSimGIhRwyReOq2kWh9c 6obragKQoaYiqq8xpQjsA0b8Dz+lRNjqHNNULLCQ8QTqd+sIyCHRUHIwFgecargLkXidOBBA Z2Nfyfc5Bg1gtCZ4XfxD2U7B9ovEIZICGrhorVS0Nh700CBpEYfXoUDpskHxIBRogbufH2GT ZOJzgCZgmDDxDsO2EcXrAH5Cfs+BbMkqOdm979RZE3FoUx5QOfUTgJjkTzh7TuQ3ijHIwIL4 bFbsHEhFFeqaHI8ttEi7BPP2lHg7a6Ij6YdMAeqKBIB0eDZzdu2JvlP05f01NKeRWWoYkCYT 
JApO3/CRaiRIq4dOPWTolnPqSmvMKMJQB9kY1kIpJERGqqqGQIoISqqgmSgN/YTt91VbkIGw oeLT4R6JmETdGFBAwmA5BnQna5vgDA3uxmkxLjgHBPX5c+9bMTJEfllByU+aXYlosA8sT5IV DdiYtiiezhME6KCycAxpwxRZSoIwDAQ7etWDvzhj4VCZgb6sI6UeNL+Qh/rYQJMAf0/3N4Dy PkPUUk35frmPzWUYLZ222cMRn8f4LGxZ/dX8PeUck9MIQbYi2Y0XLQpB+5ZRD4X/AV6uHJzo dv7NAa46PXeuz/HRSGfTzWgWRUhezBrp8ud8gfyw2wZykpLESIDup+ELxSUkO5piPiy9y/5v W1rejn/cA8bvZSXeyb36akMMqvTVNi2cZQwiOQdpn/JB7UiKudzrRn4Cl9UE9yLmPMTxxs4p 9k8u/kkiQkJCfT6bPOT03/owBIoUnje8SgSJxHUNiOGAoIyUGUiBQ+EEnnyUQo9lzpUkJW9m v4aFuKMjecvFMmc4K/hbtMGC0KNmB1IOgYnawEjKB8/sLFdQPmIpIEAgBBgBBYgaGfycsnjf Wbns97ifhiG2EBCWDgy2ezIOCZuHqA5BgwLRyKA+4Xqv7AbzcoLwN0kJISROTgchKeZ6jxOn HiFbOHphSiC/M7zgcTjdHXR2OpkDQyYUTRtMGpqJg6ny7x7z94cGZJJPHuMqvHLNsQ2DoAgg 9O0mQ73mRDOEkkyN6HWWBpzXwDQJ4Ac0dmHB6ctfDuqVKKSBCCKiFK2gqqqKeHPwPtSQPgng 5eYoWJfmBRjcfdnYWcbdaqmuO0ra1msHZiewmxzeeSwQ3A87MZ+43WmD4ehsvb3eo1xJA3IM InHnDhlwqU1KnZZVVRT0HNGah1fZeg5ojMF7DCYHTd87VZ/enoTJ007i8weuCfKRnRORnvKL 0mkjJDQ4pfFE1NBxy+J46AC+SaoePR8GdeY0Oon6yB1yZIWd8JZT7nUTl34mkldyJF+W6bVG hyqDFTEtFAYTNXmwRfYFbIjDIDgnYrrymbkZFU5Lqm99C85s3EIHA4NZkPCc3ObeV1IvQUgd IHcQRXdzWKEaMbM5TRdoEAKY2djORzMPCYadqM8s8W01KtOipv8EJGBrymTg00DnOVaHC8vQ DrRIvLI13BVJ5w74xMOdCcxg4z1H/NqTxaU5amyr0wXaEllAtrsAxGNudJK3XKzIc1rNSDkI gQz7KMMwmfASnDuHexMcxlJKzLqSwMPggPN7OCZxMoxoDovUGE0vTfXCsykpezjzQ8fSc92E shUokOVEIclDI9AdqkPHLkNOU8QNlA6Hnv3kc9DvSGK8Nal34NZEVEW3ZRa1NGCd3cXPn4PY G89Ide1Mw6OzfFdfHR2NW7GMSyeBPA8PH81v7nuneaJ3WmXkL24iLJjMLWFQzGUDO3JbVqSq Bq2Mzag3O0VDPJWznJO7z0XHCC8nh4BoYSSQhGRkIRDciPc4PW9DKlNAvlfgGoCLvBEPXl9b jk5O7mHU5hhXetSjIeB5ljoqvlnvdvQ1/t55u8mEiSqqoqIoZJQlAkKGEpk2a4MqzOoefjmn meAVJNR45E5FlPn4v2EN/BhxU8mjz9NB31tGwGxbNquWws4NWIoNVdmJSQjYTwJOwhMJV2FY kJJL7y0fLTW07EwxZmawH3YE55XaqRKWIs4EeLwLCezE9uDgqXxp7uKk4OjRJNAORBQQrnHS NgBcSAK8CsLczDhQyLqaqA2oHHMSME0JGIcMHNjwYG8NDty48ZEgruTR6HyTfd17/jPEZzuJ i7zIzKqPj85DaHCzYbaLN1Molve+Bp5C6dA19OIep9mGlrGowUQkJJCWizQZi41F4S2YI21x SnZa7NtIRDBXwh0bcU3ZRBCIZUMmYyj18e13ri9vDvynoUYqMYQ9RsvEOD5DwO274h9QVHnk 
8EPBz7/f6SqoqunjPauke5M9wJYxQSQDWSjc/T9R9jR4o0ofe++faU3J6/nfU0WKFH8R5AQZ v9YCEEW+gb3HqPXJyRE/3RkCHJnt+iJ+tOzBMQQtB+w35X1wBByJMkXju3cH/0ZDryfk8e2i eodWOS8+93/dDyOw8mYKDxwUoiQiCsIiC583o8w9ocyeZee1TP8oBtDPALD+WGNaC8ViCJGI iGJJMGxuRCIm+SLAOZRQ7ppDQyTquQCmQYm9rbzazo/nOA4LMlaj0LjcmNqKSP7ppk4GA7Ka TMMzr0dfE1JhhM2kiRgkAgE2p05Y5h2Agfga0jdF0oFhGKCjGRjAh9BzZeaZHHttm0Gw9oGT ukh04DakFquJblGRHCVxCjkHThLQFUHrXlBdhHHXyT+509+14U9BsNyzUIlJu897je7TUwO0 OBjDrtisibplhCIEgup3FdMCjiiPKqDremiNuxIjCNecNq68yih7iDIkEEh5DYp5JBwEB75n edWDo+nKM18eIbnAwF9wXrZlB3qthsfPrffMgbPc5OCGoQQ5EE+geSZ3MLskOJDBT3nF9fiY DTsSzJoQavEcYx8Uk7p0nga1YWMOBmMwNcBnkyvzVVAEEK0IDD2nfxfKaF+0xvJjYOCZmZCO OpMzO6DTIXJEIHMG4JJhDLPqA0AqvPli8aPaAYUTOESg9XDySe2q5plTOjzSSSQIEbOuBjhs tLG2dvHoeHPX0cpIphSdDbtxTonMwSl1H0BIcCx1OAGDY5eQ5rBDneZCcSyg4n4zLFKMSng9 tqRQnGWUVRi2RFGCDy+J4JgRdiXg5A7YxGXiIqZmYiIQGXXk0xZ54ILB7MykT3e8A2x5VhqA q8HmZ2JoPOv3pTELDkTbjPnGHuJElO/c96zB8CnNSoRJjYzxsgw8UhiXGQqjn4mE4HnMr1KG Pj489azpN+k1RU2Hf14kD01F11RceuHptHk8dqWGVbMB3YOxQMceQbbDBjAdcZL7MFa9VHCo 8SJdhTVTQEURDDJQTCQykzEMKYr2BOB6vMlg6GfdRkQDdymceQduejeuRdsJnQhe4yHsIo3N 3t3TAilp4925xrzrXHZ0FFhjTGeD1OhxXOzemXQhtOnfN+RaRurI26m0dS+cOYptg9QDsGQS YAOxgO5XTcMqHJIhe0DIcGgai6ZT0OTQETIdPTr6HAgwOkDxxhnSeeN5RiC+5jtPZyntik8Y CdDqxxi3YS/AD3+vCON94tTjPqk5uLWKKTS7sIQJDO2GBaNhltksV7velfIFUIslOgtpmqXQ OR1XmnF1N8MHTdG45iE3cDtFk6k79w1sCeqlMkczGKzmizkvA2MPDTGTXKYNgcDoD70UkTSQ elCWhbHwwdCQGgIKLiYnIWZQiyDIYgU5cxls2okUTWSrFK0GKZlC7fBwhPqSHQoTr34pixk8 S6nrkmnpCPQOiONOHx9POOMSd3eO72zgtORwhCCxx7vq+3CI0tTBSox1ofh6k+ee2A9OEPRD wo5EJHy6B7khSx7mGbSnRDBrGiBTkpQ9fuwiWqXG7yeOlhgxnEkvNRb1FBNYGfeMd17U8yVv PGGTIPYmE/qcN0ITHZDggb07kDiLxgLDVDZOcnMmLa2D1Dt5qIii8PffWHKuqdJnvFXXtO/v FdolJ1gaCoZSjCcXhkKAtOX4lQ8/eB3BKIdpKech0LG7VQOBKaFCWWjZANOtYpSJ1zBIhfIj FmuIHET1wgPP0nXSYiWtYrSahhg5gYEnYf6noCCaNSnpk4h8hNzJggAVoYwKfwLzkZWj6WCv 33b7d0iFK3cj3tenZAqp1/cG0olivUL9Ll8I+b1wRKgcMtk9wQ5Nl8zPfLPYkgStMjCB0OnS q6CTExKYiJpSJCZoiiiXsUCx2uuqn0KT51APRi/IQQPoJ/CH4e+ifBtNV3I0HlU3DOUvx/mi 
cF5Y6nVnAC4BtPET4UkgEilgO3N5Uicq6Y85OjIPw5ofog1f+6WfDii4XMHZD+ax9P0lTEiP ohxILMByAiZs8taQiMzBBmVA3yiH6f85OgNSY0/RkpgMjJNURU4EuJ3ap1hYdoyTf0X93ra9 UCkKwYzqONRc4Mf6n+vw0ROp0Wl0aJYwCoXFTELDsGyyqcESqhZ+j5f3P5nCjwPh8HMc4mzZ GCRA/b2+/2k+bzk0dAkUiwkVqFJ1UbfZQbhgnEzHBPJDuWJYaVTwVhMjw+aOhFPYmQTOyH3y DRG/mrhOlLJ14QeTJcCPuuZLlBfVDRSH+f/IGL/Q+6apoe8bizDXMXaehDHOH0vq3Dyqro0V +12es0PoW+MTdFMiMDsv8cZ02hJJT8zYkAXFN5OJvD4yB9T7AkE8IIlaumjbdXDjgyCQhEyY hZd8/Ez3VPGs2DCNSEUKPd+0vl2Bj9TCmc4fJgw7K5HC8mZJZqIcsSZMztAaRYGbgLXCWZg0 nUfgDpBo+HN/ChFSEE4nIXY6c/Wbg4kj+TN5wj+z9ZkLjiv3CWropibEcGjgYFI/2NTJxO29 dvY5DpDTBVWZTu4d8Ewa+YWHnOD4gBtFGAQQLT6Y+HIEoiJerf8PBGzKVGmFBGIhWgQjIIRo VZfUHl+cdoeY8/y+y1uPAcZONt3UTpugwaFLZKVMrW3IcF6CUGmhHJhXe1NT5lAZGIwEgfED QDwSK0PQDWenMTfV6zgTRLA5AGEg0fBOEmoXGClggT8Up1JU1S1MCFG7ISQPf+DhOxGoPi8k 6BrzCd7tDU1QXzoYKhgy0Ug6D4SmFCtqUYNAXziAxJLoB2cTLuZPpKwN2YBoFscRo400B44G 4OtIPpt5A3Zh8c4BjU1AR5hBUU7w/myYVCIuiKJ0H4R+YYBK7EaIXzphIbip95AT5uv8uva+ OH28ze7BfUeCei/DvWkeOtVSEF3y5B9/DEgkmQphiJInSSGJ4wUxAvIgdIdwbHd4qHqDRQ7K /IUKNWHDxQZmZtwVWgXeHkZjxnG/qYziEkhOI7PneyBiecKyFHCLkDsqFQPBD0OidhgQOZKS gCHMQS1Lrxlk7a8k68qBxBO0Qh9vvPjlGyltbSflGb8rOAjgsArZVGiIbuPKh5iKwIKh5ast E8gSFFRjYWXi4LSwSoNQKhRIHYPM8+HI9wu1AJFVdb9OhJhAmxEw/9WM0+k3MXiPi1Q05Kgo D8KRjKQwkwVBAs+swwkSCSF8E8D6sRcCHfQ9DakwCSYhoqGwEIyMMMaGpkZYAr12oNPnfhRQ /VChASlAKlAlh1FDiatEgAhwJR4IOLL+/eHSPvAhxODCA23Yn55KGd/W3qYxL7nBea/CrjFE hEXMkD4fLj/QCJoyyAZKwyDxsycFFgwUSzVtYTQN1UI/q6PwYbfYU3Jw56fw0SeQGHkgUKBS Ck95vr5sDu7sa/5IzrsKcwikKmTVAPDWM7QOWOOIxzTD1MDozEQuJWoLea7nTTaEwnhw7G55 nxDbUBPk5GSQ4gGFVd/gQU3nfDnGJ8RAKWNaAG0ieT20SieWsFNXCBi8UPUAvYMCSAKWPvWA UzBBgGIK+JKAf1SJxZjsH7+zrkQzsYQ9uaOyHxDyaVHZ/FY0A/WB+t5BLSiFJbE4LOqEAtML tkhgT1hpyLv/T3fkSvPfind3cfKaDogCZiHKoCU80+LBQQjGe3indDZua+aveyQQcG+NzhE9 aJetOCgGaDIgGhvr656OV9VG2TlTzU8PMLwS4PsgUkh0mZ7ufk9ok1JuOlfkD2SVgsAYIYGI k/qF4mi/t2UmI/jzDVmBhB/GGkfpivakpEBJKnN8tCF8arTEsTxEEToIU/a9tGcKiWxBgED2 0bDOfbi7ttbg6ErI3Kb3WMkChNxx07DOhrOiS7CcJckIpCEMk+SA2FwROI4KbzFlE4hMGOxX 
agHYXgSvuJ0S0RPbBgUQlIahVDugQwEnpYgQ5bX9pyIdgB5Xc/NbKf0/hoH66H1nDkmHRjDo oPz2gnBX336YByghyydY2OEdNaWwFnhYk8iseVOSFRut/hgkcBvRPdRq+3MpG3Wik7daGQMi Gmn4bAcRE2+FOIxLfFhK3IbGosNITUmlUeg+M+LsQzR8Knn+MA5cG9IM5B+/9h+Mjk6mUZAv ASGRM/1/7K6LkMGSZmkczx3oe/l1Hqp9hgsbf6in4LqeabgFRGPcIUlQNAKE6Qh8ne7C+uqh FDuh9gEgDtIX2yq/hRj71Q9KQo6qKFBAwybtDx9ywNUfEvBeEPNpvtCqkQssGt96JDuUTWyt a8JlcoD0BGCakE0gAm5GQA+CBqTMAa3/YdwWGHN1la4AYhOzhSAUhxdDNE+Mnx64wjJvLz25 Ts3uMoSTGGyRSxrmFLQQC6K5m7qnIMmEyU7KDItGiG7mDPWB2G4ANBRYbAUo0sqG5/ncO3Ti /HZIeaRswIz68+n71VXzq7cEgZMT6VgyioRgNTtMNHCdyj8UWGBgSlIFCnkmYnpPEMIiInDh loI6eZ9tCfur7+NrAPhPZ9BJRJHZCBpjtX/JG1Fe/4DjLIcUmSQWUqUkRH75CjBuLE1gWXbM rNczGYlhjVBbWFFClRlCgVlQpWNrSaCBvHHhDIB1+QtQSuvJvYR+BlNNDXJe10FyiFMJvbWj wIKYI7KlAwwwjr/EeYUPF63bpRpGTIQPVGgQKhApUsvusJwZpN2g+DuhQPJBHzxZ1ldtUEZC SX30VVFVVVQGNqDttTUhqgwT3uzPBhg+unM06gjUCEvjBOCw1T3H4GuO/2gvp6yg9fl6BkRO +pygRoKts3pIEmCUmE6HBAgQklBVBrQGwgiEKlQUrCyDEUQghRJSU4aUkzANj77AmkTR0PKq qYiojqcEcf49nimwlfTge0l0wEEn2wYMCSHmAB9797R8PGBE0N9pY0Jso6tKOcBnRGZ1UQ7+ y7Tn2KBypQF5GYMNBfEOI57FvMd51X6fJFOKDU4QeTAoXjm6CQdBCQhvX4wx1suKPiVuNoRg GMg0wdYh3RXj8R+Xk2AnsubPizEI7JFwL443A8xBOoECg8VLQkfLPoU2q0uNeTSWYmIQxRo9 JHD6JDKeWONG5dBuQ8k61e4ddGSaQjs5WG18x40DpI6yjAYggqPUxYY1MXyaOQVN47gwxQ8F 0B6iDqfm8pS+V3SLIHN0gQ1JlgFAiNJhZC00iBEJJd0AYyFAyqffPi3ohhjghYK2lBAEYtLZ 00S1CT9MSTRh9Pt+an6EStI3FdJLNaxo2sGQi5OHIflO9PK208zFMT7vv/OQhVOAf15CiRCM CNkCk8OId1CY/HDJ618Dze8Vs+zDEqDnaLEzi5ZJjVNgHTKJXiHgzR3HEB8CXFPi+bdkb7uq atspbQROu7k7kKxKxEG+OHXly5cxYVTG8y4juVoU0GaiuY09ceiWyDpPMnNdcEyTe+UOXgze XLMSs7dSlsDo0XGN0YPxq9iE9zOEDlMMnXlDz54TFZ5uFUiWWUjOBdAhjjOPGxJRFYZKLOD6 OudcAssDdbjWKFTu0GKSNpDDO04I9jY2Xt6B7h/VSWdoh2FfFxCbkMJRb/TgOcOGMIHAcGjb hGw8vEUY9TkgTrc1r5oU+JqEDu2KHhnRaeOBQYWqdbwOh4tiw77pJu6BUHghRPDFoVlDwKyy B+fZ1g8u6vkQyvq8THJRDzyWdmMxeHBOhcR5RSohpgubYNUi5i5ioZKDFll10NNjFgeGcc52 vVOHVOMee2wqHiNR7ZOkYxQMlWFoVhEVqgncsKNGKdoOLwQCxA8DxC92dnt3MSdF6YQLFNTx 4ukPFQ9xjpnFZSqNDnxtwNi1cV1wmhLCEhpVhnHHAUJQ59VWxRw9o0mdSS5AaP8jWeztm8O4 
6hPPXW4lfUHRpTt+NWaEHnU4cIZT0dwTpoebhOHMJ5SbWU8wp45OHO4Y6G8707GJ6a0CRcyo 4Xj0weRHpVkJM7SwO0Cidhw/U0CnCAshedf0nwPXVnOTJi+S2ZGXIebwqYsFbvcA+EnMDg3i E5BTKx0EwmwGusavFibvPXTSEBeOgUvNIiZyC4C+BIYSi7MNWFbwwb1gyCObdEtaUWoUhSp1 yBScDY1nzodTs6gdimvUXziyiRUsUKyFCihSEyFLEkExOtexQfbAPsSIaz53X5ND82s+bfBr byxBRAmffHo4JypCQSiUaLaxSWl/Rv4RQ9X8U/ac5BS/Thj4Mh7wHrICipZSl8/nJkZD3hD3 8Pq9ee4jGUqXggMiAOgkiJki0pSe3MglQKDgB7PvGI5IGISJABArmsFydQOpC8UiFRYQDCRJ BU8AMh0Zy32KcGGJEI6ZkaCwTZq+HHY+0X4VyJJsE0TlIPTATcNwh1GCB6TpPXbeHYZv+0IA PBKYBBuu+Vq632kk9MpeuJfiT4ypqYmJRt8VYjjjyB52qunEUEgdhwYLGMtUupw7WYmwZ9ba 0Ur3g5EiOPdKELCRuXgrYtDrQYg0IlHJDpFTggdonuh5zymUgP7FFA+NB8XyJW2KJdCY+xuY 7DjiFZx7Nk6TunQpoipbW5Zzs6oKYTo6izmxMgcedkUl7OF8+OjZeuDSY7G2zrpFkonPZpMo NL0lVUPzPkfCef2nzvu08ns8TxI9zS8pLXnGAjJIdUX9dkmQNA1KYVnGqtCh6PPynV2odwkV zyGhsCgLWAeic/uEEk9fsVPhdHCdhKUUhQDCEI2u4DdCAOFpNgMjc4xmZJkQExp7COgZGkvO gR2j2wEFJ8PaJ3n0Q7d0IHPxwbvA806e8R5nvbONnGeCwGoPsPkeOU6o6JvzEPoQYDqrFCQD OGGJ4Ya8xTheutW6CBIk2/AoMiqx69sc8kocjqa8FNh4E1EqNA/gkGjCBJlFOTggoWlJnuxM kk7OwJIBmBiAiAlIAYgIgKClkgSCAIIAmBkYEYgIIAhAKgUwHPQyGYQs0Kv9zw2D4iJt4zzG B9YIin12wfqLTXY6pwUOjo6GoUEBSMY2lqc1RElD2nu8E4dnypzAyCJKDB+jw2aOgknkR7SM IoxqteUpxeIkeBuhPCUK70bWlcfcMuj0x8nU6D4t8eCD9MtYwx6pO9H71GrooR+Z3NoNkbgV +/2vfCrE2vaDAo91GUXl11veQ6IHStx+fhKikztXChMkxg79SZwO6BNboUaJtFIrhFTahWtr MmuY1u12NRjIVofqONc0GRGWyXBuUM352pOdtphAUjsdVBYFsCDS7xhY1s/NvIhnFjkN55F2 uhJBXFGWM3AmQhMI65fOVSavGoi2dFSEK0PhFqsRvOLQiYG1dASEEaprUwM0msxw4QwZ5Aw6 GCARwjvw5PQIDtOeGOdh0xtDDQuz9kaB0y5Nt5c1Vt0w3lCt9rTQw2YagHJyaBiVhI2iGkoI WgtNUJpEdunOpzenGg9B6mOBeGB4yIkQb7IQOuE6lhopZOxHjhNkiWcYjhHIbACaShKoKKSU UknAXoyvgTkaCTiqqqqAdWsp4PFITryidngMKFTSY67mMm0XeI6xFNa3ll0heYSSU3rrCRw+ E8nAnlVECLjgkoDQEM6CTp7HgJwJw3jY89YUQ7jIGmcXCUxbl4pwujduQI4oHoVehvsYeFNn ZuCrfJlx2thwpYX7kHEYTQx2LzvuuAnZ2zJwvB1yxlhMk0pMqQ6Ig6UmhP03hKODjRGg8hNO WAjiOa64WqTTUnlsfLsO3gwm07JLTtaGBINCNMceRENZydrybPGQMZDI3nw3YsF21vzXc6as cYdmHodr8eWJTJmnsPycuGGmWpSwSq1wxDNkZQmgQF8vW32LKaxOmBFHUeSgO51ZjBLZzksv 
BDAhMkTmUWcyjmiTnHwduWeYZeTGuNOs6G1HGa34Qd0M3CZu+zEDoYGbjnx52cHn2rsPmr8m pjYcB6mlhw2rqTnRm+BENwjgY3GQ3U460sGCTPDabmZY1ONedIeRGjcNvRjli1seWLSIXZwr TS21njgqlc5xnYnZ822yTbAeTiLuGtuz4NngffkdoZmgQkdJOQLBLMxiB+ECHR0GyW8jA4Ug nv2aDyzlt6u20rRwmsJHZn3WC6Glg6ENbQNTu3lAZfrNvbB1Ajzy9N3DnyMOvAPfe9NVlmNQ ujzbT8Bl2zy+B0zjkGUnHQKJNdgxg05hrz12i00pdjjk5UQPJ5WQHAUw9M2BDdacMCwzpjoe DFeMTTcuG03Y0zsR21jQO9U23a7eTHA8YBHCDOJg71lMSgJOwoIZdioNdrdMnHoUGxOzWebA VB2WWw7S+cT4pzYh3H2vBs5groc44HwmBu5ZC7ZgiDrsO5yCN+EWxU90TJuDWkx2gxvHlJ3O zmi58Gs1WN4Fgy/OjSbxMlE0aEqdY70qJd3xA53IsSvvh5HTc7yYmCb8PR1zkxXmu9IQn5IH M2/AJeHjRTFeZsydNlZCKxA7TEkeRrQrcsL8Dk6DzgWdF24qMivdFNMycOpMXO/lNq2xjkYO h1R2jd+BNDJJCQJJQzmfAdBBI0oPMXftHT2k6H4kJOOyo502cM2GK86245XDWYk3tbNxPOZx 01IoczJJHWdUTngjqvGDnJN8s3LO88OTTGHtymfGYhc8DbliN1UTDtFSae1Oam0hxRMJkQkK tQZnflb5vFv1sIY5WSmdtkUaRxoxtSZzzI2SCSGV3EU7aauHfB0affdNNykcQkQIRQ5L13DP AMqUk4moRY4+XK6vc1Ttl3JN1nnOMD47U5snRBe97MSnvZNaFzfbRwuDOE24EzNIepA3frbH Oet6hWiTgU7Q7ydqLUUBKGqBEfRciE1BSfRCBlSuT9Bffq/IdNm9qh9AjCQkJhrv8dzZLiwT H5acHcdlr6od3xv9Ep5eF4uI3TIlLzAZVcZ3gkJxp57e3gETlQNmHRmZmkgmaODXPO3vPJQv Cor/HS8vqABSyd+ticIeUIWVHuwnTntAvuNuMhxz7JMz4jqr6io2LVowzKBJwd3BBCgMEswN c1gzE4s6s2hXPE7tDs+K9IvafxeXtnungQQUhJJm7KDlkZt9bIjjDGm+yT89YZay7OLwtI9u UtolpZGTm7OcDPbWuPT1IB7EhwOzlO0qJ8kCcNmJyyogzhB78UxJmuBGdKxE4UbKLC25YJuY uRBzKeqp6T7ongh4StocT0ixTtPHn6fxI/QUfCTqlHYJ5tl7Gg7u5gQjAwPkBCV3aePyrpsw p2jWyq+UQqzd5l+Hi8D7cdHDTGLrTn25ngZ4F4d0DbjxoDDzGwEOXyN8Ric3Xk8wm5vDJuro HCD0RCLYOzu4GtRDpzpE3dXgGLuiFBKkpsIroZ0YnAYXs77JNoFEVHMZNHcNlWhBrc7cCrYn TDzsZ25fjJsQ5TNRXLFu5J0b6ZmF0caNNuo5GWoOXHDWYc7UId8/aan5aC7USS6sg6jLsiZ6 WS6jCdgyCP14mBxAMRkEfouveigmjo1gQedIK3FSwQMkCsQ2pkj4j8BkCTSLLo/rr1NWRvM4 H4dQfeOSlflYpNDqE/K9fgUVw9YMVjGMduDcG3EQScyAlnn83m7C4w4OfEUhvIzb0vj5X6ir q3/XBDCXmpUcBKLrAIdbC3RxxjSEiQZAJCReBxF49GxjOPcsROW9Kx5zFu0z8pISJn6Kus5f ORpoEoUDyqJUR+IrvnGJEciegX0wZRo2w4OfRBCJB5aXhsSZb9ODPBVgp8doCDmnaZ7EDbSi hBad/qO7k9AOJqdyDtJ1TlV1xCmkI+1tyMB2HHYA8HkwCZ+phmMEDAXLnHHcicLxDSDBOi0e 
fOac2cHPYec8GTO2Hgk4OlLOrajkjA3lMWmsakDViWt8kSQUXqS2cZRbTicC2YwxxkMZMlSD BnWbos7XodvGaYq9PEEGh55u5JEwwjLh1nijGclcRyEM0cYk4Yhx5coNpjRrAkU3FkWEPA47 RwI2pImOxVIkwJii96kHp4c4oAkvhzvFNQBZpcRI8V5XwRw9nJJg6IYHRa6Zo6khr1fNy5ol 5gLkFr+CywhzAgh6FHAMBxEvPrDIdT+JgBSBzdlr1w7gspUfhOhLgJk3iwoiPeokXuF+H0+m Enqop9cIQonLky4vGTFLRmnJFhX2Q0qilWoJ2AH13bDh7pD9sEyGphRvaWEx/ePh6D5z5vgS wsPVLh/oAQ8D1+QEGUJAITnrP3vV68o0f2nJvgtrr77hgFkuUEbNnG25HkkpmIP4CKdqIiCT Qx5MN+NBCDZrifhBZ63ENlwtwsAo/TLM634/jk2FDpB320sQg+AAfxASUKPtft2B8dsQAeAk TPqV93zejhA94EPghVMqFKAFUUBX2BsFNQwDCfka/oHqFJ2MGKLAUYKVmChoIlKiKUpfl4Qe 7QMGhsQ/ZgWfM06odEE0vfdUIwogV9AwewzSob2KBhKJcc7DsZ8oggKmigJEYloWSBPH6U75 /Ip9qs41GAyCY2vIrCBJoQ7cS8Lg7iBQ4saCTHOni+/v8Nleb3baMHz4LdmI7ImDEmKqymqC R1AFBuCIShKLyRT7ff8U9tqdUQ8BnjqjLx6nOnatF9OWtpkfCdIFTcVYUVWH4iPlcBGEkCOj +ie2aQMuJrMJ4apW9le06LMwh0Q576TsPmllLTDhFrKWUs0EMmQUDysGFo5EN1KHSD45M15x NxBH3oMaSQiQmQo9QePw+wuXr9cymutwLPrVgY4UZFTvieRRqPwU55aYDzqx/SCe5VPSO+iR 6E/PKsQJMwwUpIfeMMnMAQFzYXdKWeICPSo6MJGGiiqKxaQnlZjekwiTEVeBhOD5EMpd6UJg 11dpXsyh8IGD9CHRoD1clOK++JQHaAhVxFsmmEBU6FywAdpRyN6DECQDw7l7uPzqsYPLOemm Vf2vVThiYYaSRkQJAEue3Fe1z9sZwMGZUA4EIofZruZBgqVm/VNGyHhAbh22DmIZqE3CMq2l wK4gNYdoKV8EsY9uKk5YYIdsjju47O2wcE09pOXsi2REjtjmomnYhkzUS5POS3XW2AyvJM2M Yy0ZHCSYZs0nCqp8u5sewhyyBBTDES3EOAmlmcSOka0+svjI8aHDpdhG7esNBGMIUhxKZHmE ix700MEHfL1LgiGRaU7rCXsnpcHhatYXlatCEGiJCICum2GaaZJN+PO+QIRs2pTrBs4cuTKZ Sa4ymtcywyyQdFnKOjNBprYYybFoLY0wxBcEhTBhiWGl0BzQeYH6g9BtUNVhF6TWdIRC3pdg dDEZLABMMfuowne5zCzX+cmJzCqFZaYTwxwQJqY+ZAR+PzETqoeo/ojn9AnYgT+ie6P2wP3i gimCvm9CHVUD35B9J8JSPkdxTxI9pRDJK0KU0CA9/voN4+uNRMwVUuID2EHr+5keAgYTfswc pMIl6sNDMwfEGOSmPvIP2O3lNZk256RLxE/uxXMWb1cfV1q+O/05bgkHU63bFErG2k1uGMWE olsMAScrOMWLfz0Ui5FqYDe6q+AwSV2216s9ixtZiIs0c92bYg3yZYz7p7ZjNaBHY1qB8YFn 0NHPftlbXLDBw2JmteG4rw+7O9vrvT3jfPOuMo02cNrY2XbREGA4gwSdHmS8nNf25HMLhz06 zMt0hcPiq5g5cxz3iLycyxL/1+4M36QyYEzQE1BEx2IF1ETn68ChoSiJYgWZGoYDh2j7HSne /FgPRCKCKn8cJl5iQppJkiAwiJA4hSIUBJAIZcRhO+kshwewpOBkVuXTpBo6aJo6EdNETFmo 
WHsBDx8o+yIBwWIsn2bopRlCBqfxIzhgsTtFjupjpOjqnSSiEYw6Z1wpkhjlFzBCEsvBOyM9 GK8EoEShyQfg6U3oXAEB20TLBMuFFsbQJEIAyKRowInFD1fd8TKgBxEz3xGkNTzfpFTAGaF7 H3w8/xNJ5GVrnBMAlFNMeQiY73mfl4afpmgdV9fbc7XKnk5uohn1MXlDEiFFCHAgj70g1EIU U0HCfcD4gD7uVfSqdr6uq+xIlqgqGEoapFSFCAAmACCCiIKqYCSFmF5TvT0ecOD0mHcHnh+s OPKQieE7QAPaAQUjS0gmheR8pPBRIagRB92qE4CIanwRfAEduKLhxgftJ58nQh7kmUxalVLZ KAdB16Q/WjROboevpHlQ6TjD0fjgUBGVV+sSkcKgyCsIqBlH1pFBKbV68izxNQ0DTkQ0no3o wU4dM6ZGPONKiKcoWO5rUhkpw4YcLJxHE5EbwxiGysbowTF2uE0Rpw0uskCxjGQWII2igsBI yJbNdBkxGFgI9eMfQkAo8NmGUaP8117vwaqqO8uSISGxMy70nU3sMoVX17L+qL+fdgV8gQIE RPH4UTs0Sf7JP6Z+GE3epUpWiT3oxUL7r+UoyhpRghdHYXckDIctvvQLEzcgjyA5V/Xid0Ax I7CQ9p5weogmYzScRnuIMaihtAYi7GXhDhP0EdV5OUDof45ASvt7Q8QOEDtJioiiQNC2LISG gTRKPxIA5RCkQpID7d1RBzdBAPKAPj5Uin3QNr0iGGREmEmAUNIdI6hK6JCI7k7lVZFIIMFV 7IDsfPJiASJ+3v5Tc52PMOofpT7KaCAfV8FVac+AB7tTAJ5kSMbpJ5lXv0ddiY1EEq/Oph3t ha5AJAdwOjR0gPJASVODEAPQR+NeiCZ/lDh79niPEaK7w6FAHkMdyrnRjuvNYqEJWqnSh1DA Pug4sl0GsTAqCKKSoiT4gIV6EFLsl9UAdbZCryWSMxWGYcqEGQOi87CRIC4wAB0gRyEGMkTv pbiEI45Nj1SWLBh3ZdmEsRhzmhhU3WYMHDd7rhiGmmImGiiiED6FjIN8Thl2deOI+EhWShlT DKVtaIblDbaRfCjBgStgCI5x0jX8YgnYKtU8K2bXgQohPdKjKFwqZEstlpCpCBQGVIgx+ogb kS0KD6fzUgmOZrIblprFvKnApYSGUwPQewzQRv0dlJ5YREugsiiGEGCEE12LcnLtOwXtH3SA ukwPvmesIQhRRnIAG+EHnzIQCu+F3CJWSD4P+SBoAuRjUDkRMZhgJtH6w7uVPBP1+UY9LuOo pL2WMlLANLWWAwZIGb/t0grk7hB3h4RQ5CMAA5+3u5tLbySHLwhwYVIYccZzFtLYpRDjoKk0 YQuoXbSCySleSQebg4IDai61gQEUpQgV0pCjKkolQwMDGrGo4aU5Ap9wc6ZmS2VHqm1hht2y jvJSVh2ACImTK0lBjJ+uGYByDRWk7ETBljCSCElJQhhJY4OB72Ij7U0L2AhFHTzV0z7MNwL7 IJIi+QKtU6jib0H6E+jvTsH+CpCqQAoApKQApQKUpRCiIaUaQqgAiECIUpFJmYQiWJQiVYgQ gQqQSgoBa4hAyRCCiFJmgYkpGhEIg8tp7xA8naV6HqwSRRApEAeekChg6m/eD70MB8aoh/Ec bJrYlCjGCWy1jhFh5tGmGQfEEIehKVozBghTZ3YnW5BhCENHTS0ocmiL+3FFy858xMxEiGfT 0ReoPn/fWUDq9u3IaPPpTVgMLrwJoKaUmCClkhgkEmZliSWZlFKIgJEYiMFBD2+P0dB+Cee8 /CYsNZ4R7SnKfquDtCInBMCPyG9hERAWAxID7t8KD4+GugiYQD7z3fUcBMvHZVDZAPmZMDBD 4j42czeshyNKacCIVyDxIUHUCushMigoACqAIQgTCDTDNDJFDEyJMQAWA3SDqkTvgn5oOfzg 
VsJdAemjI4e//B5Ki4640JN3fjOU3dnU1nIhBMGP7LPADvBJCAJEL0QeaDy/h+WahAoVFemY fn7HchHzYYFLR4eMBgnjBhAhZIUHyEv4qAn05Ht+FEfq6fnXOc4phDauswS4NUhxiB5OKQ3j mHP9Z+TCZhnhWTGFqIawdyH3ghg+9dTU/IlJQmyRwQi3mEZaZxP44kQkPd6UQ8NeCJlDBY4L LI4IuBiPsBwUPnT2xKJxT3GQT5msNglLqd3EEw4TSBMyqkE3i1iuyWRQCBmy8qwFQSoqhg0C 6XQ206tOoomobQ21o6ooYgBIAekT6RE+9kRUoDwDoYT7Jv3HY8/g+unnD8bEkE5TotT2geQC e5PEJNA6JgglUwmUi/NPoL+s8gddFwwx2SwSh4pZqr9jYTNYJCiQofn86bB9R+3yqO3cFm7c HTHzOHdtju7TmIvgB1iZ60LuOAnCLcFNKXO9Y6R1wWJHvka6nVki9p9M/XCn1SNrAgstp0kG 0MxnKT8hcPReRzP0/naWlKcknQ6KrB8UmB7QhWHkKoQmnOAeUrLihw/heDh0daJ3Q4dlbbG8 hWEMkUMlK2cGcROJEFU6GUWKqMeDKDKlTI6mNzDMdnDjqNsopbUGJejq4LDnatnV4rznIpuR sSbd7pUfzdcOusIL4pVyJMFlSWRztek0a9VkXnVJVC0xOmybHLWj8YnABT5wDxc6ptwFHUQL gSHuQke40etAdvA9IVQehu+eqO6E0SUr7OgcEvtQU4cqBxphCiuM5B+FiK/AQfqSnRfxkKSi n7iVDtH1wAncnzAj4I+ZUfDkIPV+0O6AAfwiocUgQfoGowhF+3oL6UO0MVsNrhOAdknL9Q0H 4oVMISLJPnHep7nRqu93fkgyEiWhkHkMgDUH7iDjWI6ZRfJjj1D+UB9hkEPXb+UmEgX6+mmi D2B01YQMEcHIpCgA0ixJQHMYThIcGsBDGEIkCJJUOg5XpLy9hQkk4bnQaRKfWFhO3BEnalvA cj9bgEI/HZ9gRPR8AWXARz6krpvbeIh3YOzrNgILcEEPYRCi0lOdvWjxKMzC0QPLaLRikAqG oeRE2SNQCYrPHR64IcCKaN0JPeKT3joKVwQGqlBUdDOIWI2L8fdy+w688Y4ZbIY5fXbhwT92 3SD4/KBQFSmdMOiXcI+5yS1nGC1jUs/z8H0saewcrtl4bTijQnciRieAYEG0xMqqpt6QXJ/N THPI/pcmOXnoIHn1zke4vhqJNCYPvD1NR7+7Tno0xyI856CDnmsy1Awc3BsXORTUB793nxvr D8DudQ2+SaRYhnIOTnE83F8UP4fzeDtIeRDZcl+u2B+xWeOMuwBM5njwbNmMEMkinz5riyxN kNAeZNwhmukbg9Rc8YQ0h5k8s0c+aTGCgigqCHzn5uRp+BEPX7fp97JD6JA+34g+BOHz1HUP cp8wS0iJEkwLpOsvpPWnU6lodTNKfh0mC+kBP0QSpIkgGZAyBiL/IZO0mh+P8WhkiKoNoVjD 6TbCBfxyASw8wGFJGisRPNpot3RYCawNalDIqGnFQ3JklHpFfVOscwEO5CRO+BO2A5lPC4tJ EnYNxEbQLH1TIBqyQn7dhkHVc+1PNVOoQNiZODw2+LDM24SEk2wKDDdWeyYCalUBMh+LyD7q LAsCCIihFFGJSmlSJKEaClJgCkoUSlkZFpKPfgfi8q7ZyKl8MNifAX8/PmGaIRJDqdtAfJg6 RaexYUtLwxCdoMxzHLy7uc6wp1cjItccATG8ATFxCRflF9A/P7O3ZAieMoR6/Yg/cQUzyYeM TVREQQBFQiUrQEkhMJBEQkQQQDBIkhTCRNAhADExIEUDMUxSqEQDCzEQkUsk0QofD+3B0SMp HjY/Mcv+GmUKmFrft9trMMNmkK9xnzfMh1fgJSlRhfswA+ZnP2kbfZMibtU6Tt5McSGwclXi 
sEYMYM5zsVR7uKbPXgpQ2HNogfi9Qdd2l9DVUThMqGokISpc29QCzHEs6vJoHQUv02HDSiFY MIqafvcaU0bsLpOQaLKk77FLcknezmmksOxBDkMDoGDnTDACJ0+E9QiYSgLiL0IqIemlIcKp DqwCoADUwIGdFYBoIBcXDmcNCGJoEVD4HgeTtX4/VHlDBtAJvRRGPnLF9O8DRdJcKhpFNwnE BrvU0nimK+5MMWq9T5sNQ9mGbiZcZWCXTIj4h6e6xhp8pDCnDDDcGMNIkRYMDA2lpEURqrTH iTV8dXEqNnzUAsNUEPYnv7/jCSTNto1kkclgSQK4oH40EPoqH+sqH1h6dg9CIdIkPzCoeXh6 sIz0OswMO+2TuJwrU+AM8jQeoWh0amWKW0HjLDWfb2YMyeLcHFtKWIiy/3NdTdG1NLNKxxEM PELyFUGHYUniFAh1ZwU+WULJYmcnUAGHi3N8M6ZRwj6N+rB8TskOZOkiuWrkOjsNqSmE0iEC QDRS1AM5CABReTKZNHTkiG35qcjAQA0YjTY8qnoMQcG63JUyHog3va6UcsFU0HxJIgSbIxU2 h1MnAo9dUECB5qdOwknYHGc8Fzd6ePmQ4vtUNl4DO2SzSotcNSwbhwsjkqkhaM+4CgykOhKd +EkDWYdXGSEpoTzw5Hgk4MbRQJ6E8snRyHVw2hbTBykz7HRRmpbCeLbpOmdghw5ek6Sqogpt DMYQMElMeDAjBzho2aNbRCKMcM2dO16EdAhDHME4zn1jDqFk64lLtATz43j2OZTrgjNqd2x7 KGQ5EwZNA44qaTNJYTMuq3CkhSJFQTO2iqHeypHmmhBTKpVVI6CiZJlRcLBQ9HGHRDdnJToi FSJwl0OFDtow4OSIDVGJcQTHzHgC5GOVDMzW4H6dH4iATkB3NcDI4J2wfGEbOVVeBa/aIr5C Q7fXeh93yWJ4uG5h2GGoTAK98GEFIxEB3CEqAbI8vQZPOJ0fh9/YO5d5GIWGZFEbhNKtQfC/ euZ5iMRxX4bAqkxaUDRLh7lg++IXpRg0ph6W2uYnYztgN70nnh1odvhDPkfD5ZSlkvRTUtPk WulshTMFJrAtlWygM9O+lxwHgUTOcTny+93AfyMh5kYQymoCEkCAvFA7j5cA+IjtA8YHVM0k EBgZHGmldSZ732MTE1/E3mA2R/F8NgfEJPyx5PY47H43X0SWwpgUFxb1wW4nzn1E4KH550Ux 3Du9CiHOUKi0GLJDoj+i0hLqKDAhB9HxVCV7gIz9I/D5vGIB4+yiY2/BpKHNeiJIkKGhEPD0 LiITkFfJCrs4gmjqdGiAYFSSnJDbimg7qO1IgI+qfK8e4+Z8BzyzUYcJ6j2ziP1ER3QRNqOs OjLC3VSZ+OdgXiWQ2JEy/mA/vJKHo8wncAwnTmaiCGpEgNKdZtDB0InkUDAjBPkPCy4Ep7VS TqZw1gj/tEbYib3ZhO5wko3mJNO4z30AjBOTDDnExXUxQQOE65E0gqQUoalNDEMQsIMYKIsV NrMMtW6hHwS7PcVz6/1br/Y3iDt9Dtto5w9PycheMagmw4JCq8WrcgiNOJAEIL9DRg5+wYJR FzmDeSSm6BsCdyWoesOxQOT0A/JYbDB9R03hbr+A32KSMCIUfVDGSY0rTObOjGm1Fch+BPgY j6ju+6yz5GfIfETxhuT7DwvmGY4qOJgoMUijkD4lBoHxUNGKgWgmbBkSNog+SqlCg9+EIGDp V2OYuSbAHs27qNu3FJp0mUN/EDY7eI37/A2Qqfg8bd+ZdJ3X3TVGLB0FfatqkaimHcgcxA4J tIfkGDnxX47TxpRWJ/SSiCIDEFUEM8dPhC0iMMQKJzGKZxgaClbMg74R+O1BEkhROowcsKES wsGgoCmbeU5QiNZw4QkMEQcmJqQ8xM58HrlehwlBdMFsO3/d2It50LtuhQhBaCDzAa6yeAlS 
eI+AqiFYIkiICuKOySps+jPAkkQpTR6SD8fF/YHcI5bvUeW3R9YiG5Ytq+h6HQqEkt0jzTmp 0HYYECDpn5GxRISyhOFpRS4r1QfTs2d8N4CbAompCQ6VxZBTyQx9JpGTzWHGrPEMFRJOAHzl CO/tfI9bEhCKMnQuiggPZ7jUh1KOdtBjieEEHh2/MJ9ITSSwJC2+BPePyHYPYheY9TA3ieOZ oyKDIimZhXZAGiJkBsPAvca8/cb8sjMDgc7bynwh6Z1MiHP1FH12i0csFUJ0iK4pAqVVcSLU S0qgdoRFkEwUk1DuXA2MwHRO6IUHVn0SdfOHj7ftyasNhiR0p3xClhATaHQpSfFiu3Kt6ydt Vz9hcE9zptNYnDOuhCHHt+2kAYgbigI65ICbQbGvSHfhseT3QHq8AO/vPDMlvNFjguFREBAw /1x6tpkJzYLxVjgVA4BvB+2TXMmULy4oB9ZE8vVQOhAxPBekQKcQiV9KxAVBI+7z/HejZj6Q eNdcC/oM2gW6kKGHWBo/pwVfmXPo/ylKm/HJqWHb7rIzxmGg5jedTm9V0bUhB0v6hdCiC8jE YeVkhN78nI8OUH44Bfja0xuUakza1rQ5+mH1q/CoQQUfn+mc+80lt+dCmtjsULbhzqew2hu0 57r440cfv6KrpsaHaTIIGSL0AkjYCbjLeG9amg3uyJCSlSimTSDWY1xqRlSw1OFOrIUXEopz poaYP5SC6sNYxiF3k1qjOXInRzo4d2VNbERPms7BPMOeEFysKPYj4N55w6HoTQTgxs+jU7Qs 4HeKURrDh2mcVijJRWIyPwzQwtpze00I5Mkzuhzqc6nWOEGddYtS0C9XHVDg4NNDwGqB2w77 9B7uzCvRO9MuukChjzqShM+OHOQ5tOuB+k9Cbms84snGIhhBE5rAixHktyTsYZpQqUtRQOJK yhcmF3ENjopTCNGM8dGB5yzbuHxMzOudE4cvMI4gnNqbYVcD+ok1iitRBgHoTSj4KoJcjRPZ N34DUvsGVPZDdHtp1y60MCJO7WKLi0E5TmhVjK0ebxiG8UDjGTU5BQIYTEsCr0fMefaAdnXs 5IrxHxpktKZKbVzCckjFPKHOoV3LBiEmNihBJKPwyOoaTSb6fRhESUMlRUCUUtDTBIF4Q+9d 9ZAkKDFUBQW3ni+0i+VD7DHJ6R0iXdH7MFq840cfcvFDicwsLKPOG5ATUUtAfeGJKh8nLBAC eNZ2JKQnR33kuoLKlYTGoRBAmQBa8HJDcpKkKMhkw1MWgaMhQQSlkMIZgZojInCyhXOGHyCn BCDuKGDYaKaxDUipg4YWQhiJqFMlUkhHNOkcE1AhmGDx3PCOHnaTYZoBr/dGM0MBKcpQRPQz EKJwxDFgyckUzZJGld4bXYQ0gYaMCaEAGX04ZJVBDSy0NKQxEtNK0g8361aNGJAidgdehSJC 3G5Q0R1QDqqVfMy5GJ8UHb1iRCmoLg6P5EMTaG6gE6WaQ1UPvPwgiieqAn5AHkUWzsCiHoCv DKG2FRKVvvC3D/P9ikwvqPxZumwdjPpf+79Xt/l9O//1/r//fzf1/+P+z+7/3/2f+3/9/Z/Z /9/5fi/5/7Pf/p8P9/9/9v939vX/af3/3f3/l9196VA9JA/4axjBcjf4CDcCym4IfIAwf3hg dn48x5aAxTKP3AcjcL+WDrEV9qIB9UBj91sbZFkE+kgq3BXTAVMpSoRC2sykgDAgMCQWhVGi hKZiIAQxTugBdQOLLpf4AIEyBNKizhxdv+QNpf+HJT6fupRobGprSAYQwYLfqgCcCQqCog8F eDh0qgcGsYkbwlDP5eaQIlTSdkYHSVkONEkuB5vPJF+Cb70dy+PJ2IslYWB9oNA/mk8gDgi0 QNoEwsSf+KDwgyfCA/r8cO685eRLM3/BB8Dod5D9mmuZDwgdx1dK9TvoWpkSclL+qKYSw4VN 
VHByHvIO9RpALN8SkgGCb2ZAUzBD8E8M0BRQTxV/h5GgWINaxsCvFbZRUihATlkDmLEEWJqS dsz1bCM5RBrI9cGSylBgFxRXURxaWTSSSSknYPDhs3qULFDk0koGCv9DO3niRSCFAkUODkxC 4MFJJTYpKFETDIaSR2EOwdHxQ874TjERiQPNtweioyiccECDCb5xLa5Vg0nm6QwpBOOoL+SO YBnpABNrCQh2udgcBqRp/Z/TXyZ/lh9HzYftgy/pjpBYrBhSPnl1uL2BxAaf6JQ2DS1EyjRD XEy7M2Sp7jvNmkGohOSR7ZHiSmoXE3F0fyQHED3ouZEOXUp1JRFo1MHDG0i2QxEqyl3IfxlB nExAaguIpbxpahynGCJIgzLxWy4uIEOFZMcqqZYcwqiHzeaEowRk29ux4XweJiCRDIIjnY5X jPh/IuFMkDOJKUk2GThCJYFBHFAhQBdB3YDB0IOqFolIc7/HFf9424chc/+NSlOk53YNkD/Q wAwhaHOWg4iqFDmu9BOPSCzQ/e65+/RSGZURiGiTw9CZAzCTDRLUgQRCDNZBmREZSQE0SLEN BDJgQJiMJkH/k+jPRIPvSnnlcpuANRTtYIeT4qVXhEFKgBgeY6odKAX/hdHWOHd2AS/TvPYT GR/jOIKIM4yHRGYPhNI/6fifqIWSQPl+VkQ9ZhJEIfyteEPco8KJ6At8v8YKvm1756Cj/KKZ mqCHuPjIX02nQRH2YhhLHbqTRohCFQMgAvdpyw4EjqKB4/rqkFUIqFD0vt6eJ3EF78hm8c6o 9wHU6wFbTUiNFCfACrW8DBnQZC5KhiQQNwqtII4QDMhCwgB0XD7we4y5AwwzY8IbR4JVZmCA pGhMD6Tv++MURMVc4JkCFEEGtdDSnwRv0YYcvqtS0yRLUpRQVJuzYBFRmoBRAKWFBYxICzIP NOlDI3GtxhAnrGA6RW6liOyRPoDn8OaQXU0C8k8zmYSzyw8QMSHAeHvLEBffwVF+xgpA8JUN iRqwgiikpbfCHeQn+9BkeJE9uF3IPaa+ntPSyoSSjElMEyUkpBAfGepl5sgKKH7POL1QTYcn RSv7j0APdFI9vFCHF5swEIhIkJBgCKSUKapCgn5uRvl/75Bn3nk5W8wB6ka5DGUC8Ymw8UVh Cwi08qnxY+wiwPxqfS+VmDYwAQ2IEId6XTdJ8o94FLkkgEIv+RpfCW8oGqPSiEAv6XunuKeb QzmpjWBaT+tiwT5CyhoM4zgsPQZshQ4p3icQ0GLGbwfiEIUVFEXYwdJwiHLTThuuHBDBA1lB iIix+6AclmYxTgffx+kmn09mh2+ralZ0IQnBMxjOeL55Rj7nJrGhwWG7iFYFlZr5mao2zReP clHa6M1QIGBNE18Aob2wIEACTu7irlXFPeAPWQGEUAzF+PqsAc9/XT5yECFsn2U2QuQGo+gQ IsfjwjmqPgZyvPz4gp0ggbn01CMpED29BQ6rIBCLwPIISpIKgo6+oTkg+xiWQfeIX5yFMBIR kBLIPp2Xf/TD4qfEN9aHIU0w4MLQxrn8V0XkO/74p7vtp1itiRU8olJD1TzgXisxzFuCyfYS eX25EzydKOoktvu6BXSh4qhI/cDyU3obUGmDtQ08IjQhwPhRfZLOcimhAy5zugURHxMpsqul 5sgD8CF37iyjl8dU4QGQQhFGRDKc02PAZw7e4gJIE/1P0g1je74nZJ0zpiOHVFNw0Si7v98x eYDXOFmJVigXiAeqNv0fJ0NuAVLQHjh4+nlYaT5KSxjrMQgaoZ4Du7qPsEPlZMe4oEo/kl2p bTjEsKoh0v58TAMjG5ODFRexQ56v5MzRhgV3w0mYoDyqgZIwWTU1DWPT0fSBV0fMy0Vko96Y dtmfRb4zF2tcUAtnIgWsxYW9l0Cv3E4kRk+1KQ0oWG4WcRU9qdD5G/Ckqdzs53w5uGPd59mz 
zskMGYGYLM8EBOHNsH6slZiSHFh0nRYpMFCUfbQFsbyNBIglhCFDyQjTjSMmDHhmN4tKQoEX DZOmdCcs8dsX13BojHREONj34KgmAlJkrRq0bTiuNiYTBDj5jA2LhXmXye3rH5WYbIMP3Ax4 I8A2Ce9oJkXE4c6ltajB4UUXB88Dbig7zwihSIhtAg8DKUfi76oI+L4LWJnoUJo/sznDJDRz e/7IlvLPlgzeont6E7U7JIRRoSCZm44kaoPd2RV3iAeabPAAgBEj+gPxA5E1solipIZYkJkq koWkQl6BfQ1WsFTX2U2CEgIZ0LoIB4CWAchgc7oqtoWbvJ11WMOkGCPKPEvwX1HZPvFOqJwi kgfy+H+xRRaqiIvNphIzKqq4BNlSRUpDF/xmIYgdoG/ALgR3mZkBhD0aI7bj7anEalUXVmtr ab1O3xBvfq8AIHC0Meje3QAseB1Ee1U8zgOAVen2B8PqoqQfYIh6OS8kNE8lKIpZRIFinuj2 OXqlQtmKWkvuJdxnUhCFQCC8jpubU4zPgKHQTCCgTvigMl6ErrzYmKUu8DI6aFp6eHvsBB3G ZZQRjgKGWHkH4fly6bnIi8GqOnVFpTk+R7bKP6DI3t3D7dRxDwRWWSgSpZ4MwDMFsCeXwLcC yy1AJ0FRSmwlBA7edajHwCIefk+JqFLCF+coO2LOo69oHN5ZHfmTdLUuCyHGU744oK4HB4LE NYPX2cHeKIvKfphk0gAUNhtsgHH19KOhb3DucIETajYsvh5EYgYO+n3veIUQbyLN+p0E1lEF HmB9PnMIwBwlGO8h6wMx1whe+MeycEgwDweaLFPelBBVOJR5aKsE4qGCdgl12YcHsDWloqiT 3zeag6PJy71liKwGEBQ1UgGJUW+uYU1MhRMbRRLGQjYGZxZ813xgc8zU6IlrZep24+sh0HTj pCHVLGFCqimgQiWY7CcgiDUtpUjCKljQOrZFnBoIjIyMScC0TCFfe0FGRUBNQ86HUlBqXAww gwpIhlYhQpmQQgKxQEiHJtNJBQGDJhrimNHGCGEMREbkySQKQic41hEZyz7hwRBBBjARDoTJ JokNElyQZKhkZBCkkUhQiyCIyICikEUI+vBgxES2kJFEZ6norpJZtYJrFgeyLNFBieLU7Sfx WTMyAUGHhFkBtE61wdXw7MvGcpwOYS8qnBkpg5pM/Uxw1gRJwc5uRV5SBHCUQ1CcQKRUAU0a IXEjZebuwYUJ2zQ6HpM5BgiDe5mUh1I4ldvNRcZ+sat40ZKAdUgrqOwIGq81jbwWmEC46nU2 OGxrHQAiW0msuBIFm5QLpe7lrRXJogk3IGmKVzE/c9uPBd87+1D15teeQI9LhLQcgTcCTYie vmQPYFwkkYMIjgAeOQPBHqD93cMJRMEhR/f59MdGvbn55x8pvEuTQKRCiYOgKCbiipJOgqRY gEolE3XW5AsrmDxOtKdbDM0RoQOTNQ6IjoUQB8Rh4nzPhs3wqUsxMEEUMBAGSYXp3glKmhha KUCgKT29+kdXYTEvdZ1wYKU5YcudUjpP7Fww5EVGAcSxjZ0hRGYU0SddHWh1qA4MYxbFglLC xsQMazdU5wJgnOcidn7KFNgyicTDS5tIFUkwQl5AnveujesTz1CAvFl/FKRWyh9cPUH2Sk8I R2V3iZp5+oATl9HuHL4CPh6fZy8gNJnxe2GJCIhGCv1RD6sn1v1kA4LKFLuqhhNeQsxG7LNA /zBqCcAQ86B6CfZqEQOIREh4DtPEP98CFVtThmJAYHrCH8QaosxWJD8EhjAxIN8/nNDqlgDy 9ddHQ+v1Hri8MyokH1fMlyh8tQdwEhR6EC8iB0QHh5mSKH2lS5rs7HD4JUD5rqISwx7FkMPD H8Ag+9IhgbGT/mBx/p0HYUfLnkBGmPm/k4wjfEEcA9CGbzdYyWQNDT8UxGxw3eAwGEtLH99K 
BUSBSUk5Hkgfw0god31uptiqFQUpkTkOOyA4IfPByf68OeDsL+7ey65Vh2VhhzGLrvx7xzMB +EUJ8QgPEUYjZVDBpCipO5EmWkLA8PiVIEAkEe0LOcE0APQoCbEQwbTyHgVLJvPDqxULMZZ2 WNHtDNiCRAN6evMPoTMY+/nGpDsHhBFeLuYUEELiSrhrt7oUGsCOBFyOZogTs7YVeicbfEFO v49yfyxAtFJIkkIATCdsgXRVeqI+Xyh/DJBIFAQQoA9UDkFOYGZUu5UfGRfWEpxcw5ycuQYD kqvFil5YZKm/ABRwkHJxCVi+rOwsVcUJTxhAyrhzh8RGheoOE3lnp2gECEiHD3gicwB9RsIJ GAoaWiWAaAhBEiXiTXpPVle6GbqGQT3IFPSgryp7IRJoJhD3hDU4UD0OwciL+X7r9c9hhKEO CPJQ3OGQ3IsTCh7RQ9uHu8wp9ki7APcf6iMcSDk/ZIUnVU6qtvQCcYRQkDBESQhSAwUBQoGo DQSGMr95Cp0nJqpIpBV8phMINoylEhh9hZCXGZJQUlKCpcxgHBTBR4Zgi9BNyPnC6VAOgDjj BpSkYkhWhQbRbQo1BQoxLCgkClCFthCo2gIQpJT4MLIYqisQ9m37wgnn4e7ggAPPygEehEoU Dd7TlaVAiA8crIhkxfQlUBQVQansXgvF0s64Qc3TCopQGqizsEmKvGV1n7awHSFZdUunt77w ww85UX17dGyMj0HG43scJMqjgpKHno2BYgnt6Mq8my4eHF0VVxeqjyT5FSisqGJ5+t9Eql+r 1MLPyJfrNzcZkhkJJQs1WHnGD+G5Kddfk1rJ+BYy6O7+UP12ZuRFoQzJmEJNp4crgTwPQfJy I3DRhP0hSSaLzqZ71nV1GdUohkpC6B50rRMqoYcfjEkmDSCXivClkMQJMun5EGcm3naKUKDz eXTiZhS0vAhoI1AYEUJ2rTKNFnONtQ7OY8GWPsqwQILTKuwReBfli+ZahmDk5hyGcQhO6Z+T hwg4si2YBG9enbjDc7y2nbg8qngw9aqVM5cJCH1b4pqolsySQyxibKJ4ZPwhzrLQFcxxqcdH GBtUdWb2O8jsJiKsReWRzpQ3Mw2LppqSywz3013mmccTAUrsQzmE4jLCDenZgh3DtJRB3+vQ RlFZxHNi5bxg8N3sjIn8KGTSlopDBxBF23GPLLMSiyWchb1DwxsjRfcoJjcamHzFMoFUIg5i AYMI/wmTlVVMzByIl3Y0uxwd3nJyjEYGScnXfjLWZTNhcspTNlTzMFpk9j9CZpBMmMJuOM9x cq3fFUwF0RRBSVBsj8PlTZ8uTmRN3rjRLbQ3D5IYGDuUajK875qyA8aDWWYPQpgHDDScoDDh LUjCxY5lgBNK2pTSjaK8DnCBqTiZtql5FPQ0QQxcjtc046lh0gux4VWDBJE4p0mIXMs4QWP1 gHGYOE2Vbp1PeOyKRSEjutFzH8MRPGPY+Gk2bHcwKfHTnid3BOqHyYuFKYk84bEHi+FGFDu3 cl2O3VwBPPLzT9LSaMO0S0D6IDG42PBbzM6MjgVO91lY/5Y75mzMaeN3gxRY8du/PLc2zAfg lsCyPNISGtLTL72pvG/XyNfLHhvr45F6GCOsdo7kIOxctDSJohnBId4Ah0oTg/vXF5cPJRtq rs49GYGc83wuUUY8UGM+SjBgu4TOOdoamOy1RQyjacfzxNFCh4wq1I6fOK+LvkvKOVO18E0y 2xJQlnKeIfHT44Kwzm6z9rGZ2q6EPD1AhzKW3+eeV4Ovh7cRbJDhtOJaG+QVFIeYtznnOU3A hjosKlDnTEN4dmb3Q559++CFx0h1DvntMy1CLkWyAtsHyGwYTIzg2T4ghiVppAJfDMX5DQdG rQ31kozT0GG5rAKgIZ4OjYHIBj6HD8845Ol/rgR6+8MgNhASRSYdxT2N9nIGD3lYamdAKQD9 
--------------070700040502090906060303--

From davem@redhat.com Sat Mar 29 04:09:53 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 29 Mar 2003 04:10:33 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2TC9Dq9007247 for ; Sat, 29 Mar 2003 04:09:53 -0800
Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id EAA26273; Sat, 29 Mar 2003 04:06:05 -0800
Date: Sat, 29 Mar 2003 04:06:05 -0800 (PST)
Message-Id: <20030329.040605.83500303.davem@redhat.com>
To: toml@us.ibm.com
Cc: netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru
Subject: Re: [PATCH] IPSec: Missing IPv6 policy checks
From: "David S. Miller"
In-Reply-To: <1048864940.14454.10.camel@tomlt2.tomloffice.austin.ibm.com>
References: <1048864940.14454.10.camel@tomlt2.tomloffice.austin.ibm.com>
X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2096
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev
Content-Length: 252
Lines: 9

   From: Tom Lendacky
   Date: 28 Mar 2003 09:22:15 -0600

   Please review and let me know if I should make any changes.

Applied, thanks.

SCTP was missing no_policy=1 for both ipv4/ipv6 and I took care
of that at the same time.

From davem@redhat.com Sat Mar 29 04:16:01 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 29 Mar 2003 04:16:06 -0800 (PST)
Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2TCFNq9007571 for ; Sat, 29 Mar 2003 04:16:01 -0800
Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id EAA26283; Sat, 29 Mar 2003 04:12:15 -0800
Date: Sat, 29 Mar 2003 04:12:15 -0800 (PST)
Message-Id: <20030329.041215.124919097.davem@redhat.com>
To: toml@us.ibm.com
Cc: netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru
Subject: Re: [PATCH] IPSec: IPv6 AH/ESP fixes
From: "David S. Miller"
In-Reply-To: <1048870457.16800.5.camel@tomlt2.tomloffice.austin.ibm.com>
References: <1048870457.16800.5.camel@tomlt2.tomloffice.austin.ibm.com>
X-FalunGong: Information control.
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2097
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@redhat.com
Precedence: bulk
X-list: netdev
Content-Length: 647
Lines: 24

   From: Tom Lendacky
   Date: 28 Mar 2003 10:54:16 -0600

   Below is a patch for your consideration for some AH/ESP problems that I
   encountered during tunnel mode testings.  Please review and let me know
   if any changes are required.

Looks good, applied.  One comment:

   @@ -287,7 +287,7 @@
    	x->props.header_len = XFRM_ALIGN8(ahp->icv_trunc_len + AH_HLEN_NOICV);
    	if (x->props.mode)
   -		x->props.header_len += 20;
   +		x->props.header_len += 40;
    	x->data = ahp;

    	return 0;

Yuck, let's get rid of these constants and use sizeof(ipv6_hdr)
or whatever this is supposed to be :-)

From ak@suse.de Sat Mar 29 04:29:28 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 29 Mar 2003 04:29:35 -0800 (PST)
Received: from Cantor.suse.de (ns.suse.de [213.95.15.193]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2TCTQq9008066 for ; Sat, 29 Mar 2003 04:29:28 -0800
Received: from Hermes.suse.de (Hermes.suse.de [213.95.15.136]) by Cantor.suse.de (Postfix) with ESMTP id 150971453D; Sat, 29 Mar 2003 13:29:21 +0100 (MET)
Subject: Re: NIC renaming does not rename /proc/sys/net/ipv4 Was: Re: NICs trading places ?
From: Andi Kleen
To: bert hubert
Cc: netdev@oss.sgi.com, Dave Jones , linux-kernel@vger.kernel.org
In-Reply-To: <20030329121755.GA17169@outpost.ds9a.nl>
References: <20030328221037.GB25846@suse.de.suse.lists.linux.kernel> <20030329121755.GA17169@outpost.ds9a.nl>
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: Ximian Evolution 1.0.8
Date: 29 Mar 2003 13:29:19 +0100
Message-Id: <1048940960.2176.86.camel@averell>
Mime-Version: 1.0
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2098
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ak@suse.de
Precedence: bulk
X-list: netdev
Content-Length: 755
Lines: 22

On Sat, 2003-03-29 at 13:17, bert hubert wrote:
> On Sat, Mar 29, 2003 at 05:47:17AM +0100, Andi Kleen wrote:
> > Dave Jones writes:
> >
> > > I just upgraded a box with 2 NICs in it to 2.5.66, and found
> > > that what was eth0 in 2.4 is now eth1, and vice versa.
> > > Is this phenomenon intentional ? documented ?
> >
> > Just assign mac addresses to names and run nameif early in boot.
>
> A slight problem with that is that not all parts of /proc/sys get renamed
> this way:

Just rename at early boot before IP is set up. That is what I usually do -
set up /etc/mactab and run it very early at boot. Running it later is
usually flaky, e.g. it can also give confusing effects with old style
named ip aliases.

-Andi

From ahu@outpost.ds9a.nl Sat Mar 29 04:58:57 2003
Received: with ECARTIS (v1.0.0; list netdev); Sat, 29 Mar 2003 04:59:04 -0800 (PST)
Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2TCw1q9008550 for ; Sat, 29 Mar 2003 04:58:46 -0800
Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 89FB6410C; Sat, 29 Mar 2003 13:17:55 +0100 (CET)
Date: Sat, 29 Mar 2003 13:17:55 +0100
From: bert hubert
To: Andi Kleen , netdev@oss.sgi.com
Cc: Dave Jones , linux-kernel@vger.kernel.org
Subject: NIC renaming does not rename /proc/sys/net/ipv4 Was: Re: NICs trading places ?
Message-ID: <20030329121755.GA17169@outpost.ds9a.nl>
Mail-Followup-To: bert hubert , Andi Kleen , netdev@oss.sgi.com, Dave Jones , linux-kernel@vger.kernel.org
References: <20030328221037.GB25846@suse.de.suse.lists.linux.kernel>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.3.28i
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2099
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: ahu@ds9a.nl
Precedence: bulk
X-list: netdev
Content-Length: 1083
Lines: 31

On Sat, Mar 29, 2003 at 05:47:17AM +0100, Andi Kleen wrote:
> Dave Jones writes:
>
> > I just upgraded a box with 2 NICs in it to 2.5.66, and found
> > that what was eth0 in 2.4 is now eth1, and vice versa.
> > Is this phenomenon intentional ? documented ?
>
> Just assign mac addresses to names and run nameif early in boot.

A slight problem with that is that not all parts of /proc/sys get renamed
this way:

snapcount:/proc/sys/net/ipv4/conf# ifconfig lo down
snapcount:/proc/sys/net/ipv4/conf# ip link set name lo0 lo
snapcount:/proc/sys/net/ipv4/conf# ls -l
total 0
dr-xr-xr-x    2 root     root            0 Mar 29 13:16 all
dr-xr-xr-x    2 root     root            0 Mar 29 13:16 default
dr-xr-xr-x    2 root     root            0 Mar 29 13:16 eth0
dr-xr-xr-x    2 root     root            0 Mar 29 13:16 lo

Which can be very confusing. This problem exists in both 2.5 and 2.4.

Regards,

bert

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software
http://lartc.org           Linux Advanced Routing & Traffic Control HOWTO

From yoshfuji@linux-ipv6.org Sun Mar 30 05:09:16 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 05:09:26 -0800 (PST)
Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2UD8Zq9001078 for ; Sun, 30 Mar 2003 05:09:16 -0800
Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2UD8TDG031461; Sun, 30 Mar 2003 22:08:29 +0900
Date: Sun, 30 Mar 2003 22:08:29 +0900 (JST)
Message-Id: <20030330.220829.129728506.yoshfuji@linux-ipv6.org>
To: usagi-users@linux-ipv6.org, pioppo@ferrara.linux.it
Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, ds6-devel@deepspace6.net
Subject: Re: (usagi-users 02296) IPv6 duplicate address bugfix
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
In-Reply-To: <20030330122705.GA18283@ferrara.linux.it>
References: <20030330122705.GA18283@ferrara.linux.it>
Organization: USAGI Project
X-URL: http://www.yoshifuji.org/%7Ehideaki/
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA
X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc
X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU]
 $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP
X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1
 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2101
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev
Content-Length: 1606
Lines: 41

In article <20030330122705.GA18283@ferrara.linux.it> (at Sun, 30 Mar 2003 14:27:05 +0200), Simone Piunno says:

> When adding an IPv6 address to a given interface, I'm allowed to
> add that address multiple times, e.g.:
>
> [root@abulafia root]# ip addr add 3ffe:4242:4242::1 dev eth0
> [root@abulafia root]# ip addr add 3ffe:4242:4242::1 dev eth0
> [root@abulafia root]# ip addr add 3ffe:4242:4242::1 dev eth0
> [root@abulafia root]# ip addr show dev eth0
> 2: eth0: mtu 1500 qdisc pfifo_fast qlen 100
>     link/ether 00:48:54:1b:25:30 brd ff:ff:ff:ff:ff:ff
>     inet6 3ffe:4242:4242::1/128 scope global
>     inet6 3ffe:4242:4242::1/128 scope global
>     inet6 3ffe:4242:4242::1/128 scope global
>     inet6 fe80::248:54ff:fe1b:2530/10 scope link
>
> Apparently, this is not a stability problem, because I'm allowed to
> delete 3 times that address before receiving a "not found" error,
> but there's no reason to allow multiple instances of the same address
> on the same interface, so this is a bug nonetheless.
>
> Bug is confirmed on:
> - 2.4.20
> - 2.5.66
> - latest -usagi

usagi code does not act like that.
In my environment,

# ip addr add 3ffe:4242:4242::1 dev eth0
# ip addr add 3ffe:4242:4242::1 dev eth0
RTNETLINK answers: No buffer space available
# ip addr add 3ffe:4242:4242::1 dev eth0
RTNETLINK answers: No buffer space available

And, patch does not seem optimal. I'd take a look at very soon.

-- 
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From Dimitry.Ketov@avalon.ru Sun Mar 30 06:12:47 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 06:12:51 -0800 (PST)
Received: from smtp.avalon.ru (ns.avalon.ru [195.209.229.227]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2UEC8q9001888 for ; Sun, 30 Mar 2003 06:12:47 -0800
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Subject: fixme: possibly bug in some sch_* qdiscs?
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
Date: Sun, 30 Mar 2003 18:11:45 +0400
Message-ID:
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: fixme: possibly bug in some sch_* qdiscs?
Thread-Index: AcL2BZM+86EVewkGQc+3orAwc2LBoQ==
From: "Dimitry V. Ketov"
To:
Cc: , ,
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id h2UEC8q9001888
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2102
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Dimitry.Ketov@avalon.ru
Precedence: bulk
X-list: netdev
Content-Length: 1433
Lines: 30

The matter of the problem is:
Some qdiscs (e.g. sch_prio) don't destroy their filter lists when someone
deletes the qdisc from an interface without explicitly deleting the
filters first:

# tc qdisc add dev eth0 root handle 1: prio
# tc filter add dev eth0 parent 1: pref 1 protocol ip u32 match icmp type 8 0xff classid 1:1
# tc qdisc del dev eth0 root

As I see it (fixme), the last tc command forces the rtnetlink code to call
tc_get_qdisc() from net/sched/sch_api.c, which in turn calls
qdisc_destroy() from net/sched/sch_generic.c, which calls the qdisc
operations' reset(), then destroy(), then frees memory if needed.
Unfortunately, prio_destroy() in net/sched/sch_prio.c does not
explicitly destroy the filter_list held in its private data (I started
digging in the 2.4.18 code, then checked 2.4.20, then 2.5.66), and
loses that pointer.
I think it causes a memory leak when we repeat the 'tc qdisc del'
operation without explicit 'tc filter del' operations.
The next obvious effect is that the cls_u32.o module does not decrement
its usage counter, but increments it on each 'tc filter add' command.
And, in some circumstances, the 'tc filter show' command shows a few
filters after I added only one! (I think it sees the last filters from
previous instances of sch_prio.)
Fortunately, sch_cbq.c and sch_atm.c (but only those) do their
destroy() in the right way...
Which kernel maintainer do I need to contact to fix that problem (if it
is a problem, of course ;)

Dmitry.

From yoshfuji@linux-ipv6.org Sun Mar 30 06:58:54 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 06:58:58 -0800 (PST)
Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2UEwCq9003431 for ; Sun, 30 Mar 2003 06:58:53 -0800
Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2UEw9DG031980; Sun, 30 Mar 2003 23:58:09 +0900
Date: Sun, 30 Mar 2003 23:58:09 +0900 (JST)
Message-Id: <20030330.235809.70243437.yoshfuji@linux-ipv6.org>
To: davem@redhat.com, kuznet@ms2.inr.ac.ru
CC: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, usagi@linux-ipv6.org, yoshfuji@linux-ipv6.org, pioppo@ferrara.linux.it
Subject: [PATCH] IPv6: Don't assign a same IPv6 address on a same interface (is Re: IPv6 duplicate address bugfix)
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
In-Reply-To: <20030330.220829.129728506.yoshfuji@linux-ipv6.org>
References: <20030330122705.GA18283@ferrara.linux.it> <20030330.220829.129728506.yoshfuji@linux-ipv6.org>
Organization: USAGI Project
X-URL:
 http://www.yoshifuji.org/%7Ehideaki/
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA
X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc
X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU]
 $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP
X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2103
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev
Content-Length: 3565
Lines: 115

In article <20030330.220829.129728506.yoshfuji@linux-ipv6.org> (at Sun, 30 Mar 2003 22:08:29 +0900 (JST)), YOSHIFUJI Hideaki / 吉藤英明 says:

> And, patch does not seem optimal. I'd take a look at very soon.

Here's our patch based on our fix in August, 2001.

Question: should we use spin_lock_bh() instead of spin_lock()?

--------
Don't assign a same IPv6 address on a same interface.
This patch is against linux-2.5.66.
We believe this fix should be suitable on linux-2.4 tree.
(This patch itself conflicts at the first chunk...)

Thanks in advance.

-------------------------------------------------------------------
Patch-Name: Don't assign a same IPv6 address on a same interface
Patch-Id: FIX_2_5_66_ADDRCONF_DUPADDR-20030330
Patch-Author: YOSHIFUJI Hideaki / USAGI Project
Credit: Yuji SEKIYA / USAGI Project ,
        YOSHIFUJI Hideaki / USAGI Project ,
        Simone Piunno
-------------------------------------------------------------------
Index: net/ipv6/addrconf.c
===================================================================
RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/addrconf.c,v
retrieving revision 1.1.1.9
retrieving revision 1.1.1.9.2.3
diff -u -r1.1.1.9 -r1.1.1.9.2.3
--- net/ipv6/addrconf.c	25 Mar 2003 04:33:45 -0000	1.1.1.9
+++ net/ipv6/addrconf.c	30 Mar 2003 13:50:41 -0000	1.1.1.9.2.3
@@ -30,6 +30,8 @@
  *						address validation timer.
  *	YOSHIFUJI Hideaki @USAGI	:	Privacy Extensions (RFC3041)
  *						support.
+ *	Yuji SEKIYA @USAGI		:	Don't assign a same IPv6
+ *						address on a same interface.
  */
 
 #include
@@ -126,6 +128,8 @@
 static void addrconf_rs_timer(unsigned long data);
 static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa);
 
+static int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *dev);
+
 static struct notifier_block *inet6addr_chain;
 
 struct ipv6_devconf ipv6_devconf =
@@ -492,10 +496,21 @@
 {
 	struct inet6_ifaddr *ifa;
 	int hash;
+	static spinlock_t lock = SPIN_LOCK_UNLOCKED;
+
+	spin_lock(&lock);
+
+	/* Ignore adding duplicate addresses on an interface */
+	if (ipv6_chk_same_addr(addr, idev->dev)) {
+		spin_unlock(&lock);
+		ADBG(("ipv6_add_addr: already assigned\n"));
+		return NULL;
+	}
 
 	ifa = kmalloc(sizeof(struct inet6_ifaddr), GFP_ATOMIC);
 
 	if (ifa == NULL) {
+		spin_unlock(&lock);
 		ADBG(("ipv6_add_addr: malloc failed\n"));
 		return NULL;
 	}
@@ -514,6 +529,7 @@
 	if (idev->dead) {
 		read_unlock(&addrconf_lock);
 		kfree(ifa);
+		spin_unlock(&lock);
 		return NULL;
 	}
 
@@ -551,6 +567,7 @@
 	in6_ifa_hold(ifa);
 	write_unlock_bh(&idev->lock);
 	read_unlock(&addrconf_lock);
+	spin_unlock(&lock);
 
 	notifier_call_chain(&inet6addr_chain,NETDEV_UP,ifa);
 
@@ -921,6 +938,23 @@
 		    !(ifp->flags&IFA_F_TENTATIVE)) {
 			if (dev == NULL || ifp->idev->dev == dev ||
 			    !(ifp->scope&(IFA_LINK|IFA_HOST)))
+				break;
+		}
+	}
+	read_unlock_bh(&addrconf_hash_lock);
+	return ifp != NULL;
+}
+
+static
+int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *dev)
+{
+	struct inet6_ifaddr * ifp;
+	u8 hash = ipv6_addr_hash(addr);
+
+	read_lock_bh(&addrconf_hash_lock);
+	for(ifp = inet6_addr_lst[hash]; ifp; ifp=ifp->lst_next) {
+		if (ipv6_addr_cmp(&ifp->addr, addr) == 0) {
+			if (dev != NULL && ifp->idev->dev == dev)
 				break;
 		}
 	}

-- 
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From yoshfuji@linux-ipv6.org Sun Mar 30 10:36:19 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 10:36:25 -0800 (PST)
Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2UIZbq9005613 for ; Sun, 30 Mar 2003 10:36:18 -0800
Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2UIZODG032724; Mon, 31 Mar 2003 03:35:24 +0900
Date: Mon, 31 Mar 2003 03:35:24 +0900 (JST)
Message-Id: <20030331.033524.114862210.yoshfuji@linux-ipv6.org>
To: pioppo@ferrara.linux.it
Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, usagi@linux-ipv6.org
Subject: Re: [PATCH] IPv6: Don't assign a same IPv6 address on a same interface (is Re: IPv6 duplicate address bugfix)
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
In-Reply-To: <20030330163656.GA18645@ferrara.linux.it>
References: <20030330.220829.129728506.yoshfuji@linux-ipv6.org> <20030330.235809.70243437.yoshfuji@linux-ipv6.org> <20030330163656.GA18645@ferrara.linux.it>
Organization: USAGI Project
X-URL: http://www.yoshifuji.org/%7Ehideaki/
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA
X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc
X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU]
 $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP
X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2104
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev
Content-Length: 1643
Lines: 45

In article <20030330163656.GA18645@ferrara.linux.it> (at Sun, 30 Mar 2003 18:36:56 +0200), Simone Piunno says:

> Because everywhere else in the file {read,write}_lock_bh() is used
> instead of {read,write}_lock(), so I'm assuming that _bh is required
> but I really don't know why.

maybe.

> - locking inside ipv6_add_addr() is simpler and more linear but
>   semantically wrong because you're unable to tell the user why his
>   "ip addr add" failed.  E.g. you answer ENOBUFS instead of EEXIST.

We don't want to create duplicate address in any case.
ipv6_add_addr() IS right place.
And, we can return error code by using IS_ERR() etc.
I'll fix this.

> - your ipv6_chk_same_addr() does a useless check for (dev != NULL)
>
> > +static
> > +int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *dev)
> > +{
> > +	struct inet6_ifaddr * ifp;
> > +	u8 hash = ipv6_addr_hash(addr);
> > +
> > +	read_lock_bh(&addrconf_hash_lock);
> > +	for(ifp = inet6_addr_lst[hash]; ifp; ifp=ifp->lst_next) {
> > +		if (ipv6_addr_cmp(&ifp->addr, addr) == 0) {
> > +			if (dev != NULL && ifp->idev->dev == dev)
> >  				break;
> >  		}
>
> you never "break" if dev == NULL, so you could return 0 before
> even acquiring the lock.

It is not a problem because dev is always non-NULL.
However, it should be
  dev == NULL || ifp->idev->dev == dev.
Thanks.
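
For readers following the thread, the two points above (reporting the
reason for a failed add through an error pointer, and matching with
"dev == NULL || ifp->idev->dev == dev") can be sketched in plain
userspace C. This is only an illustration under stated assumptions: the
demo_* names, the integer address field and the single unlocked list are
hypothetical, and the ERR_PTR()/PTR_ERR()/IS_ERR() definitions below are
simplified stand-ins for the kernel helpers, not the code from either
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins (assumption) for the kernel's error-pointer helpers:
 * small negative errno values are encoded in the top of the pointer range. */
#define DEMO_MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical, much simplified address entry. */
struct demo_ifaddr {
	unsigned int addr;		/* stands in for struct in6_addr */
	const void *dev;		/* stands in for struct net_device * */
	struct demo_ifaddr *next;
};

static struct demo_ifaddr *demo_list;	/* no locking in this sketch */

/* Nonzero if addr is already assigned; dev == NULL matches any device. */
static int demo_chk_same_addr(unsigned int addr, const void *dev)
{
	struct demo_ifaddr *ifp;

	for (ifp = demo_list; ifp; ifp = ifp->next) {
		if (ifp->addr == addr && (dev == NULL || ifp->dev == dev))
			return 1;
	}
	return 0;
}

/* Returns the new entry, or an error pointer saying why the add failed. */
static struct demo_ifaddr *demo_add_addr(unsigned int addr, const void *dev)
{
	struct demo_ifaddr *ifa;

	if (demo_chk_same_addr(addr, dev))
		return ERR_PTR(-EEXIST);	/* duplicate on this device */

	ifa = calloc(1, sizeof(*ifa));
	if (ifa == NULL)
		return ERR_PTR(-ENOBUFS);	/* allocation failure */

	ifa->addr = addr;
	ifa->dev = dev;
	ifa->next = demo_list;
	demo_list = ifa;
	return ifa;
}
```

A caller would test the result with IS_ERR() and pass PTR_ERR() back to
the user, so a duplicate add can report EEXIST rather than ENOBUFS. Note
that a plain NULL is not an error pointer under this scheme, so callers
that can also see NULL need to check for it separately.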

(I don't understand what you mean by "you could return 0 before
even acquiring the lock.")

-- 
Hideaki YOSHIFUJI @ USAGI Project
GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA

From pioppo@ferrara.linux.it Sun Mar 30 12:17:16 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 12:17:25 -0800 (PST)
Received: from smtp4.cp.tin.it (vsmtp4.tin.it [212.216.176.224]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2UKGYq9006557 for ; Sun, 30 Mar 2003 12:17:16 -0800
Received: from abulafia.casa (62.211.159.111) by smtp4.cp.tin.it (6.5.033) id 3E71D11500684BD0; Sun, 30 Mar 2003 14:28:23 +0200
Date: Sun, 30 Mar 2003 14:27:05 +0200
From: Simone Piunno
To: netdev@oss.sgi.com, usagi-users@linux-ipv6.org
Cc: linux-kernel@vger.kernel.org, ds6-devel@deepspace6.net
Subject: IPv6 duplicate address bugfix
Message-ID: <20030330122705.GA18283@ferrara.linux.it>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4i
Organization: Ferrara LUG
X-Operating-System: Linux 2.4.20-skas3
X-Message: GnuPG/PGP5 are welcome
X-Key-ID: 860314FC/C09E842C
X-Key-FP: 9C15F0D3E3093593AC952C92A0CD52B4860314FC
X-Key-URL: http://members.ferrara.linux.it/pioppo/mykey.asc
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2105
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: pioppo@ferrara.linux.it
Precedence: bulk
X-list: netdev
Content-Length: 3576
Lines: 126

Hi,

When adding an IPv6 address to a given interface, I'm allowed to
add that address multiple times, e.g.:

[root@abulafia root]# ip addr add 3ffe:4242:4242::1 dev eth0
[root@abulafia root]# ip addr add 3ffe:4242:4242::1 dev eth0
[root@abulafia root]# ip addr add 3ffe:4242:4242::1 dev eth0
[root@abulafia root]# ip addr show dev eth0
2: eth0: mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:48:54:1b:25:30 brd ff:ff:ff:ff:ff:ff
    inet6 3ffe:4242:4242::1/128 scope global
    inet6 3ffe:4242:4242::1/128 scope global
    inet6 3ffe:4242:4242::1/128 scope global
    inet6 fe80::248:54ff:fe1b:2530/10 scope link

Apparently, this is not a stability problem, because I'm allowed to
delete 3 times that address before receiving a "not found" error,
but there's no reason to allow multiple instances of the same address
on the same interface, so this is a bug nonetheless.

Bug is confirmed on:
- 2.4.20
- 2.5.66
- latest -usagi

Following is a patch attempting to fix this bug.
It's for 2.4.20 but should apply cleanly on 2.5 too.

Credits:
  Chad N. Tindel - discovered the bug and showed it to me
  Peter Bieringer - confirmed it's a bug
  Mauro Tortonesi - suggested sending a patch to this list.

Regards,
  Simone Piunno

--- net/ipv6/addrconf.c.orig	2003-03-25 21:33:55.000000000 +0100
+++ net/ipv6/addrconf.c	2003-03-30 13:48:23.000000000 +0200
@@ -89,6 +89,8 @@
 static struct inet6_ifaddr		*inet6_addr_lst[IN6_ADDR_HSIZE];
 static rwlock_t	addrconf_hash_lock = RW_LOCK_UNLOCKED;
 
+static spinlock_t addrconf_add_lock = SPIN_LOCK_UNLOCKED;
+
 /* Protects inet6 devices */
 rwlock_t addrconf_lock = RW_LOCK_UNLOCKED;
 
@@ -621,6 +623,24 @@
 	return ifp != NULL;
 }
 
+static struct inet6_ifaddr *
+ipv6_addr_already_present(struct in6_addr *addr, struct net_device *dev)
+{
+	struct inet6_ifaddr *ifp;
+	u8 hash = ipv6_addr_hash(addr);
+
+	read_lock_bh(&addrconf_hash_lock);
+	for (ifp = inet6_addr_lst[hash]; ifp; ifp = ifp->lst_next) {
+		if (ipv6_addr_cmp(&ifp->addr, addr) == 0 && ifp->idev->dev == dev) {
+			read_unlock_bh(&addrconf_hash_lock);
+			return ifp;
+		}
+	}
+	read_unlock_bh(&addrconf_hash_lock);
+	return NULL;
+}
+
+
 struct inet6_ifaddr * ipv6_get_ifaddr(struct in6_addr *addr, struct net_device *dev)
 {
 	struct inet6_ifaddr * ifp;
@@ -908,7 +928,7 @@
 	return;
 
 ok:
-
+	spin_lock_bh(&addrconf_add_lock);
 	ifp = ipv6_get_ifaddr(&addr, dev);
 
 	if (ifp == NULL && valid_lft) {
@@ -920,12 +940,14 @@
 					    addr_type&IPV6_ADDR_SCOPE_MASK, 0);
 
 		if (ifp == NULL) {
+			spin_unlock_bh(&addrconf_add_lock);
 			in6_dev_put(in6_dev);
 			return;
 		}
 
 		addrconf_dad_start(ifp);
 	}
+	spin_unlock_bh(&addrconf_add_lock);
 
 	if (ifp && valid_lft == 0) {
 		ipv6_del_addr(ifp);
@@ -1033,11 +1055,19 @@
 
 	scope = ipv6_addr_scope(pfx);
 
+	spin_lock_bh(&addrconf_add_lock);
+	if (ipv6_addr_already_present(pfx, dev)) {
+		spin_unlock_bh(&addrconf_add_lock);
+		return -EEXIST;
+	}
+
 	if ((ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT)) != NULL) {
+		spin_unlock_bh(&addrconf_add_lock);
 		addrconf_dad_start(ifp);
 		in6_ifa_put(ifp);
 		return 0;
 	}
 
+	spin_unlock_bh(&addrconf_add_lock);
 	return -ENOBUFS;
 }

-- 
 Simone Piunno -- http://members.ferrara.linux.it/pioppo
.------- Adde parvum parvo magnus acervus erit -------.
 Ferrara Linux Users Group - http://www.ferrara.linux.it
 Deep Space 6, IPv6 on Linux - http://www.deepspace6.net
 GNU Mailman, Mailing List Manager - http://www.list.org
`-------------------------------------------------------'

From yoshfuji@linux-ipv6.org Sun Mar 30 17:35:07 2003
Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 17:35:21 -0800 (PST)
Received: from yue.hongo.wide.ad.jp (yue.hongo.wide.ad.jp [203.178.139.94]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2V1Z5q9010466 for ; Sun, 30 Mar 2003 17:35:06 -0800
Received: from localhost (localhost [127.0.0.1]) by yue.hongo.wide.ad.jp (8.12.3+3.5Wbeta/8.12.3/Debian-5) with ESMTP id h2V1YqDG001965; Mon, 31 Mar 2003 10:34:53 +0900
Date: Mon, 31 Mar 2003 10:34:51 +0900 (JST)
Message-Id: <20030331.103451.118020141.yoshfuji@linux-ipv6.org>
To: davem@redhat.com, kuznet@ms2.inr.ac.ru
CC: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, usagi@linux-ipv6.org, pioppo@ferrara.linux.it
Subject: Re: [PATCH] IPv6: Don't assign a same IPv6 address on a same interface
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
In-Reply-To: <20030331.033524.114862210.yoshfuji@linux-ipv6.org>
References: <20030330.235809.70243437.yoshfuji@linux-ipv6.org> <20030330163656.GA18645@ferrara.linux.it>
 <20030331.033524.114862210.yoshfuji@linux-ipv6.org>
Organization: USAGI Project
X-URL: http://www.yoshifuji.org/%7Ehideaki/
X-Fingerprint: 90 22 65 EB 1E CF 3A D1 0B DF 80 D8 48 07 F8 94 E0 62 0E EA
X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc
X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU]
 $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP
X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=iso-2022-jp
Content-Transfer-Encoding: 7bit
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-archive-position: 2106
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev
Content-Length: 5287
Lines: 185

In article <20030331.033524.114862210.yoshfuji@linux-ipv6.org> (at Mon, 31 Mar 2003 03:35:24 +0900 (JST)), YOSHIFUJI Hideaki / 吉藤英明 says:

> In article <20030330163656.GA18645@ferrara.linux.it> (at Sun, 30 Mar 2003 18:36:56 +0200), Simone Piunno says:
>
> > - locking inside ipv6_add_addr() is simpler and more linear but
> >   semantically wrong because you're unable to tell the user why his
> >   "ip addr add" failed.  E.g. you answer ENOBUFS instead of EEXIST.
>
> We don't want to create duplicate address in any case.
> ipv6_add_addr() IS right place.
> And, we can return error code by using IS_ERR() etc.
> I'll fix this.

Here's the revised patch.
Thank you.

Index: net/ipv6/addrconf.c
===================================================================
RCS file: /cvsroot/usagi/usagi-backport/linux25/net/ipv6/addrconf.c,v
retrieving revision 1.1.1.9
retrieving revision 1.1.1.9.2.6
diff -u -r1.1.1.9 -r1.1.1.9.2.6
--- net/ipv6/addrconf.c	25 Mar 2003 04:33:45 -0000	1.1.1.9
+++ net/ipv6/addrconf.c	30 Mar 2003 18:51:29 -0000	1.1.1.9.2.6
@@ -30,6 +30,8 @@
  *						address validation timer.
* YOSHIFUJI Hideaki @USAGI : Privacy Extensions (RFC3041) * support. + * Yuji SEKIYA @USAGI : Don't assign a same IPv6 + * address on a same interface. */ #include @@ -126,6 +128,8 @@ static void addrconf_rs_timer(unsigned long data); static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifa); +static int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *dev); + static struct notifier_block *inet6addr_chain; struct ipv6_devconf ipv6_devconf = @@ -492,12 +496,23 @@ { struct inet6_ifaddr *ifa; int hash; + static spinlock_t lock = SPIN_LOCK_UNLOCKED; + + spin_lock_bh(&lock); + + /* Ignore adding duplicate addresses on an interface */ + if (ipv6_chk_same_addr(addr, idev->dev)) { + spin_unlock_bh(&lock); + ADBG(("ipv6_add_addr: already assigned\n")); + return ERR_PTR(-EEXIST); + } ifa = kmalloc(sizeof(struct inet6_ifaddr), GFP_ATOMIC); if (ifa == NULL) { + spin_unlock_bh(&lock); ADBG(("ipv6_add_addr: malloc failed\n")); - return NULL; + return ERR_PTR(-ENOBUFS); } memset(ifa, 0, sizeof(struct inet6_ifaddr)); @@ -513,8 +528,9 @@ read_lock(&addrconf_lock); if (idev->dead) { read_unlock(&addrconf_lock); + spin_unlock_bh(&lock); kfree(ifa); - return NULL; + return ERR_PTR(-ENODEV); /*XXX*/ } inet6_ifa_count++; @@ -551,6 +567,7 @@ in6_ifa_hold(ifa); write_unlock_bh(&idev->lock); read_unlock(&addrconf_lock); + spin_unlock_bh(&lock); notifier_call_chain(&inet6addr_chain,NETDEV_UP,ifa); @@ -697,7 +714,7 @@ ift = ipv6_count_addresses(idev) < IPV6_MAX_ADDRESSES ? 
ipv6_add_addr(idev, &addr, tmp_plen, ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK, IFA_F_TEMPORARY) : 0; - if (!ift) { + if (IS_ERR(ift)) { in6_dev_put(idev); in6_ifa_put(ifp); printk(KERN_INFO @@ -928,6 +945,23 @@ return ifp != NULL; } +static +int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *dev) +{ + struct inet6_ifaddr * ifp; + u8 hash = ipv6_addr_hash(addr); + + read_lock_bh(&addrconf_hash_lock); + for(ifp = inet6_addr_lst[hash]; ifp; ifp=ifp->lst_next) { + if (ipv6_addr_cmp(&ifp->addr, addr) == 0) { + if (dev == NULL || ifp->idev->dev == dev) + break; + } + } + read_unlock_bh(&addrconf_hash_lock); + return ifp != NULL; +} + struct inet6_ifaddr * ipv6_get_ifaddr(struct in6_addr *addr, struct net_device *dev) { struct inet6_ifaddr * ifp; @@ -1344,7 +1378,7 @@ ifp = ipv6_add_addr(in6_dev, &addr, pinfo->prefix_len, addr_type&IPV6_ADDR_SCOPE_MASK, 0); - if (ifp == NULL) { + if (IS_ERR(ifp)) { in6_dev_put(in6_dev); return; } @@ -1499,13 +1533,14 @@ scope = ipv6_addr_scope(pfx); - if ((ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT)) != NULL) { + ifp = ipv6_add_addr(idev, pfx, plen, scope, IFA_F_PERMANENT); + if (!IS_ERR(ifp)) { addrconf_dad_start(ifp); in6_ifa_put(ifp); return 0; } - return -ENOBUFS; + return PTR_ERR(ifp); } static int inet6_addr_del(int ifindex, struct in6_addr *pfx, int plen) @@ -1597,7 +1632,7 @@ if (addr.s6_addr32[3]) { ifp = ipv6_add_addr(idev, &addr, 128, scope, IFA_F_PERMANENT); - if (ifp) { + if (!IS_ERR(ifp)) { spin_lock_bh(&ifp->lock); ifp->flags &= ~IFA_F_TENTATIVE; spin_unlock_bh(&ifp->lock); @@ -1633,7 +1668,7 @@ ifp = ipv6_add_addr(idev, &addr, plen, flag, IFA_F_PERMANENT); - if (ifp) { + if (!IS_ERR(ifp)) { spin_lock_bh(&ifp->lock); ifp->flags &= ~IFA_F_TENTATIVE; spin_unlock_bh(&ifp->lock); @@ -1660,7 +1695,7 @@ } ifp = ipv6_add_addr(idev, &in6addr_loopback, 128, IFA_HOST, IFA_F_PERMANENT); - if (ifp) { + if (!IS_ERR(ifp)) { spin_lock_bh(&ifp->lock); ifp->flags &= ~IFA_F_TENTATIVE; 
spin_unlock_bh(&ifp->lock); @@ -1674,7 +1709,7 @@ struct inet6_ifaddr * ifp; ifp = ipv6_add_addr(idev, addr, 64, IFA_LINK, IFA_F_PERMANENT); - if (ifp) { + if (!IS_ERR(ifp)) { addrconf_dad_start(ifp); in6_ifa_put(ifp); } -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From greg.daley@eng.monash.edu.au Sun Mar 30 19:06:27 2003 Received: with ECARTIS (v1.0.0; list netdev); Sun, 30 Mar 2003 19:06:35 -0800 (PST) Received: from ALPHA8.ITS.MONASH.EDU.AU (alpha8.its.monash.edu.au [130.194.1.8]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2V35jq9012669 for ; Sun, 30 Mar 2003 19:06:26 -0800 Received: from thwack.its.monash.edu.au ([130.194.1.72]) by vaxh.its.monash.edu.au (PMDF V5.2-31 #39306) with ESMTP id <01KU6BHB58YW9BXEWS@vaxh.its.monash.edu.au> for netdev@oss.sgi.com; Mon, 31 Mar 2003 13:00:46 +1000 Received: from thwack.its.monash.edu.au (localhost [127.0.0.1]) by localhost (Postfix) with ESMTP id EEB7A12C019; Mon, 31 Mar 2003 13:00:42 +1000 (EST) Received: from eng.monash.edu.au (knuth.eng.monash.edu.au [130.194.252.110]) by thwack.its.monash.edu.au (Postfix) with ESMTP id 170F712C015; Mon, 31 Mar 2003 13:00:35 +1000 (EST) Date: Mon, 31 Mar 2003 13:00:34 +1000 From: Greg Daley Subject: Re: IP, MAC address duplication detection ? 
To: Seong Moon Cc: netdev@oss.sgi.com Reply-to: greg.daley@eng.monash.edu.au Message-id: <3E87AF52.8060209@eng.monash.edu.au> Organization: Monash University MIME-version: 1.0 Content-type: text/plain; format=flowed; charset=us-ascii Content-transfer-encoding: 7BIT User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529 X-Accept-Language: en, en-us References: <003401c2f402$3fbc4450$28acfe81@seong> X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2107 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greg.daley@eng.monash.edu.au Precedence: bulk X-list: netdev Content-Length: 680 Lines: 26 Hi Seong Moon, the IPv6 implementation does duplicate address detection. I'm not sure if this is similar to the IPv4 gratuitous ARP. Greg Seong Moon wrote: > Hi, there. > > On a Linux box, how can I detect IP/MAC address duplication? > I'm using kernel-2.4.18 but the kernel does not seem to have > a gratuitous ARP implementation. Is that right? > > I know I can detect IP address duplication with the arping program, > but I want to implement the following mechanism. > > When the Linux machine boots or one of the network interfaces > is assigned a MAC/IP address, the Linux box can detect the duplication > of the newly assigned MAC/IP address. How can I do this? > > thanks. 
> > From Robert.Olsson@data.slu.se Mon Mar 31 09:15:54 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 31 Mar 2003 09:16:01 -0800 (PST) Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2VHFCq9018463 for ; Mon, 31 Mar 2003 09:15:54 -0800 Received: (from robert@localhost) by robur.slu.se (8.9.3/8.9.3) id TAA19385; Mon, 31 Mar 2003 19:14:36 +0200 From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16008.30588.705825.524303@robur.slu.se> Date: Mon, 31 Mar 2003 19:14:36 +0200 To: "Feldman, Scott" Cc: Robert Olsson , Jeff Garzik , netdev@oss.sgi.com Subject: RE: [Fwd: [E1000] NAPI re-insertion w/ changes] In-Reply-To: References: X-Mailer: VM 6.92 under Emacs 19.34.1 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2108 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Content-Length: 1784 Lines: 59 Hello! A better approximation I think but probably not the last... Cheers. --ro --- linux/drivers/net/e1000/e1000_main.c.orig 2003-03-27 14:38:02.000000000 +0100 +++ linux/drivers/net/e1000/e1000_main.c 2003-03-31 17:56:05.000000000 +0200 @@ -1999,11 +1999,17 @@ mod_timer(&adapter->watchdog_timer, jiffies); } -#ifdef CONFIG_E1000_NAPI - /* Don't disable interrupts - rely on h/w interrupt - * moderation to keep interrupts low. netif_rx_schedule - * is NOP if already polling. */ - netif_rx_schedule(netdev); +#ifdef CONFIG_E1000_NAPI + if (netif_rx_schedule_prep(netdev)) { + + /* Disable interrupts and register for poll. The flush + of the posted write is intentionally left out. 
+ */ + + atomic_inc(&adapter->irq_sem); + E1000_WRITE_REG(&adapter->hw, IMC, ~0); + __netif_rx_schedule(netdev); + } #else for(i = 0; i < E1000_MAX_INTR; i++) if(!e1000_clean_rx_irq(adapter) && @@ -2025,17 +2031,16 @@ int work_to_do = min(*budget, netdev->quota); int work_done = 0; - while(work_done < work_to_do) - if(!e1000_clean_rx_irq(adapter, &work_done, work_to_do) && - !e1000_clean_tx_irq(adapter)) - break; + e1000_clean_tx_irq(adapter); + e1000_clean_rx_irq(adapter, &work_done, work_to_do); *budget -= work_done; netdev->quota -= work_done; - if(work_done < work_to_do) + if(work_done < work_to_do) { netif_rx_complete(netdev); - + e1000_irq_enable(adapter); + } return (work_done >= work_to_do); } #endif From toml@us.ibm.com Mon Mar 31 10:07:11 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 31 Mar 2003 10:07:22 -0800 (PST) Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.133]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2VI6Nq9019322 for ; Mon, 31 Mar 2003 10:07:10 -0800 Received: from westrelay04.boulder.ibm.com (westrelay04.boulder.ibm.com [9.17.193.32]) by e35.co.us.ibm.com (8.12.9/8.12.2) with ESMTP id h2VI5aEf053566; Mon, 31 Mar 2003 13:05:36 -0500 Received: from tomlt2.austin.ibm.com (d03av02.boulder.ibm.com [9.17.193.82]) by westrelay04.boulder.ibm.com (8.12.8/NCO/VER6.5) with ESMTP id h2VI5ZAn124548; Mon, 31 Mar 2003 11:05:36 -0700 Subject: [PATCH] IPSec: Use of "sizeof" for header sizes From: Tom Lendacky To: netdev@oss.sgi.com Cc: davem@redhat.com, kuznet@ms2.inr.ac.ru, toml@us.ibm.com Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Date: 31 Mar 2003 12:07:08 -0600 Message-Id: <1049134030.1253.143.camel@tomlt2.tomloffice.austin.ibm.com> Mime-Version: 1.0 X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2109 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
toml@us.ibm.com Precedence: bulk X-list: netdev Content-Length: 3228 Lines: 87 Below is a patch for your consideration eliminating the use of some constants in the AH and ESP routines for IPv4 and IPv6. I believe there was also a typo in a memcpy statement in net/ipv4/ah.c where iph->ihl was multiplied by 5 instead of 4. Also, the ESP files often use the constant 8 when calculating header length. This could be replaced in a couple of ways: - use sizeof spi and sizeof seq_no - use sizeof ip(v6)_esp_hdr and subtract the sizeof enc_data - remove enc_data[8] from the ip(v6)_esp_hdr. You could then use sizeof ip(v6)_esp_hdr, but you would then need to fix the references to enc_data in the code (3 refs in each version). I thought I'd get some comments or other suggestions on which approach would be best and most understandable/readable. Thanks, Tom diff -ur linux-2.5.66-orig/net/ipv4/ah.c linux-2.5.66/net/ipv4/ah.c --- linux-2.5.66-orig/net/ipv4/ah.c 2003-03-31 09:35:36.000000000 -0600 +++ linux-2.5.66/net/ipv4/ah.c 2003-03-31 09:22:47.000000000 -0600 @@ -18,7 +18,7 @@ static int ip_clear_mutable_options(struct iphdr *iph, u32 *daddr) { unsigned char * optptr = (unsigned char*)(iph+1); - int l = iph->ihl*4 - 20; + int l = iph->ihl*4 - sizeof(struct iphdr); int optlen; while (l > 0) { @@ -132,7 +132,7 @@ top_iph->frag_off = iph->frag_off; top_iph->daddr = iph->daddr; if (iph->ihl != 5) - memcpy(top_iph+1, iph+1, iph->ihl*5 - 20); + memcpy(top_iph+1, iph+1, iph->ihl*4 - sizeof(struct iphdr)); } ip_send_check(top_iph); @@ -288,7 +288,7 @@ x->props.header_len = XFRM_ALIGN8(ahp->icv_trunc_len + AH_HLEN_NOICV); if (x->props.mode) - x->props.header_len += 20; + x->props.header_len += sizeof(struct iphdr); x->data = ahp; return 0; diff -ur linux-2.5.66-orig/net/ipv4/esp.c linux-2.5.66/net/ipv4/esp.c --- linux-2.5.66-orig/net/ipv4/esp.c 2003-03-31 09:35:36.000000000 -0600 +++ linux-2.5.66/net/ipv4/esp.c 2003-03-31 09:22:47.000000000 -0600 @@ -367,7 +367,7 @@ 
crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len); x->props.header_len = 8 + esp->conf.ivlen; if (x->props.mode) - x->props.header_len += 20; + x->props.header_len += sizeof(struct iphdr); x->data = esp; x->props.trailer_len = esp4_get_max_size(x, 0) - x->props.header_len; return 0; diff -ur linux-2.5.66-orig/net/ipv6/ah6.c linux-2.5.66/net/ipv6/ah6.c --- linux-2.5.66-orig/net/ipv6/ah6.c 2003-03-31 09:37:20.000000000 -0600 +++ linux-2.5.66/net/ipv6/ah6.c 2003-03-31 09:22:47.000000000 -0600 @@ -287,7 +287,7 @@ x->props.header_len = XFRM_ALIGN8(ahp->icv_trunc_len + AH_HLEN_NOICV); if (x->props.mode) - x->props.header_len += 40; + x->props.header_len += sizeof(struct ipv6hdr); x->data = ahp; return 0; diff -ur linux-2.5.66-orig/net/ipv6/esp6.c linux-2.5.66/net/ipv6/esp6.c --- linux-2.5.66-orig/net/ipv6/esp6.c 2003-03-31 09:37:20.000000000 -0600 +++ linux-2.5.66/net/ipv6/esp6.c 2003-03-31 09:22:47.000000000 -0600 @@ -468,7 +468,7 @@ crypto_cipher_setkey(esp->conf.tfm, esp->conf.key, esp->conf.key_len); x->props.header_len = 8 + esp->conf.ivlen; if (x->props.mode) - x->props.header_len += 40; /* XXX ext hdr */ + x->props.header_len += sizeof(struct ipv6hdr); x->data = esp; return 0; From pb@bieringer.de Mon Mar 31 10:24:45 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 31 Mar 2003 10:24:48 -0800 (PST) Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2VIO3q9020102 for ; Mon, 31 Mar 2003 10:24:45 -0800 Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id 0432113876; Mon, 31 Mar 2003 20:24:02 +0200 (CEST) X-AV-Checked: Mon Mar 31 20:24:02 2003 smtp2.aerasec.de Received: from [192.168.1.2] (pD950FE32.dip.t-dialin.net [217.80.254.50]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id 5377513875; Mon, 31 Mar 2003 20:24:00 +0200 (CEST) Date: Mon, 31 Mar 
2003 20:23:58 +0200 From: Peter Bieringer To: usagi-users@linux-ipv6.org, netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, ds6-devel@deepspace6.net Subject: Re: (usagi-users 02296) IPv6 duplicate address bugfix Message-ID: <9360000.1049135038@worker.muc.bieringer.de> In-Reply-To: <20030330122705.GA18283@ferrara.linux.it> References: <20030330122705.GA18283@ferrara.linux.it> X-Mailer: Mulberry/3.0.3 (Linux/x86) X-URL: http://www.bieringer.de/pb/ X-OS: Linux MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2111 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev Content-Length: 904 Lines: 32 Hi, just my 2 cents, I already saw that the newest USAGI snapshot includes a fix. --On Sunday, March 30, 2003 02:27:05 PM +0200 Simone Piunno wrote: > When adding an IPv6 address to a given interface, I'm allowed to > add that address multiple times, e.g.: ... I didn't dig into any patch or into the related drafts/RFCs, but I think one scenario should be caught, or at least discussed: Scenario: Address was already added by autoconfiguration on receiving an advertisement (limited lifetime). Now the same address would be added manually (unlimited lifetime). What (should) happen? Mho: manual add is allowed, both addresses need to be listed. Peter -- Dr. 
Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/ From davem@redhat.com Mon Mar 31 10:24:12 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 31 Mar 2003 10:24:16 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2VINVq9019980 for ; Mon, 31 Mar 2003 10:24:12 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id KAA16190; Mon, 31 Mar 2003 10:19:17 -0800 Date: Mon, 31 Mar 2003 10:19:17 -0800 (PST) Message-Id: <20030331.101917.85396623.davem@redhat.com> To: toml@us.ibm.com Cc: netdev@oss.sgi.com, kuznet@ms2.inr.ac.ru Subject: Re: [PATCH] IPSec: Use of "sizeof" for header sizes From: "David S. Miller" In-Reply-To: <1049134030.1253.143.camel@tomlt2.tomloffice.austin.ibm.com> References: <1049134030.1253.143.camel@tomlt2.tomloffice.austin.ibm.com> X-FalunGong: Information control. X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2110 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 606 Lines: 15 From: Tom Lendacky Date: 31 Mar 2003 12:07:08 -0600 Below is a patch for your consideration eliminating the use of some constants in the AH and ESP routines for IPv4 and IPv6. I believe there was also a typo in a memcpy statement in net/ipv4/ah.c where iph->ihl was multiplied by 5 instead of 4. Thanks a lot Tom, Applied. Looks like not too many people have been testing IPSEC links with IP options :-) - use sizeof ip(v6)_esp_hdr and substract the sizeof enc_data This sounds the best. 
It's a bit much to type, but it's the most descriptive expression. From davem@redhat.com Mon Mar 31 11:11:22 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 31 Mar 2003 11:11:27 -0800 (PST) Received: from pizda.ninka.net (IDENT:root@pizda.ninka.net [216.101.162.242]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h2VJAgq9021766 for ; Mon, 31 Mar 2003 11:11:22 -0800 Received: from localhost (IDENT:davem@localhost.localdomain [127.0.0.1]) by pizda.ninka.net (8.9.3/8.9.3) with ESMTP id LAA16306; Mon, 31 Mar 2003 11:05:52 -0800 Date: Mon, 31 Mar 2003 11:05:51 -0800 (PST) Message-Id: <20030331.110551.14104246.davem@redhat.com> To: yoshfuji@linux-ipv6.org Cc: kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, usagi@linux-ipv6.org, pioppo@ferrara.linux.it Subject: Re: [PATCH] IPv6: Don't assign a same IPv6 address on a same interface From: "David S. Miller" In-Reply-To: <20030331.103451.118020141.yoshfuji@linux-ipv6.org> References: <20030330163656.GA18645@ferrara.linux.it> <20030331.033524.114862210.yoshfuji@linux-ipv6.org> <20030331.103451.118020141.yoshfuji@linux-ipv6.org> X-FalunGong: Information control. 
X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2112 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@redhat.com Precedence: bulk X-list: netdev Content-Length: 1211 Lines: 26 From: YOSHIFUJI Hideaki / 吉藤英明 Date: Mon, 31 Mar 2003 10:34:51 +0900 (JST) In article <20030331.033524.114862210.yoshfuji@linux-ipv6.org> (at Mon, 31 Mar 2003 03:35:24 +0900 (JST)), YOSHIFUJI Hideaki / 吉藤英明 says: > In article <20030330163656.GA18645@ferrara.linux.it> (at Sun, 30 Mar 2003 18:36:56 +0200), Simone Piunno says: > > > - locking inside ipv6_add_addr() is simpler and more linear but > > semantically wrong because you're unable to tell the user why his > > "ip addr add" failed. E.g. you answer ENOBUFS instead of EEXIST. > > We don't want to create duplicate address in any case. > ipv6_add_addr() IS right place. > And, we can return error code by using IS_ERR() etc. > I'll fix this. Here's the revised patch. Applied to both 2.4.x and 2.5.x. BTW, 2.4.x patch failed in two spots, one was author comment which I easily fixed, second was in privacy code which I did not apply yet to 2.4.x (I fixed this too, don't worry). I do not want to put privacy code into 2.4.x until crypto is there. I plan to put crypto lib into 2.4.22-pre1. 
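[Archive note] The revised patch in this thread returns an error code through the pointer itself, using ERR_PTR(), IS_ERR() and PTR_ERR(). For readers who have not met the idiom, here is a minimal userspace sketch of how it works. It is only an illustration under stated assumptions: the three helpers are simplified stand-ins for the kernel's own (the exact threshold used by IS_ERR() has varied between kernel versions), and struct toy_addr, toy_add_addr() and toy_addr_add() are hypothetical names that merely mirror the shape of ipv6_add_addr() and its caller.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel helpers. Small negative
 * errno values, cast to a pointer, land in the top few KB of the address
 * space, where no valid object can live. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical address object, for illustration only. */
struct toy_addr {
    int id;
};

/* Hypothetical allocator: returns a valid pointer on success, or an
 * errno encoded as a pointer so the caller can tell failures apart. */
static struct toy_addr *toy_add_addr(int id, int already_present)
{
    struct toy_addr *a;

    if (already_present)
        return ERR_PTR(-EEXIST);    /* duplicate address */

    a = malloc(sizeof(*a));
    if (a == NULL)
        return ERR_PTR(-ENOBUFS);   /* allocation failure */

    a->id = id;
    return a;
}

/* Hypothetical caller: passes the precise error code back up instead of
 * collapsing every failure into a single value. */
static int toy_addr_add(int id, int already_present)
{
    struct toy_addr *a = toy_add_addr(id, already_present);

    if (IS_ERR(a))
        return (int)PTR_ERR(a);

    free(a);
    return 0;
}
```

One property worth keeping in mind when reading callers converted to this style: IS_ERR(NULL) is false, so any path that can still produce a plain NULL (for example a `cond ? f() : 0` fallback) needs its own check alongside IS_ERR().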
From pioppo@ferrara.linux.it Mon Mar 31 18:46:14 2003 Received: with ECARTIS (v1.0.0; list netdev); Mon, 31 Mar 2003 18:46:19 -0800 (PST) Received: from smtp4.cp.tin.it (vsmtp4.tin.it [212.216.176.224]) by oss.sgi.com (8.12.8/8.12.5) with SMTP id h312kBq9023814 for ; Mon, 31 Mar 2003 18:46:14 -0800 Received: from abulafia.casa (62.211.159.14) by smtp4.cp.tin.it (6.5.033) id 3E71D115006F81B0; Mon, 31 Mar 2003 20:58:08 +0200 Date: Mon, 31 Mar 2003 20:56:48 +0200 From: Simone Piunno To: Peter Bieringer Cc: usagi-users@linux-ipv6.org, netdev@oss.sgi.com, ds6-devel@deepspace6.net, linux-kernel@vger.kernel.org Subject: Re: [ds6-devel] Re: (usagi-users 02296) IPv6 duplicate address bugfix Message-ID: <20030331185648.GA3928@ferrara.linux.it> References: <20030330122705.GA18283@ferrara.linux.it> <9360000.1049135038@worker.muc.bieringer.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <9360000.1049135038@worker.muc.bieringer.de> User-Agent: Mutt/1.4i Organization: Ferrara LUG X-Operating-System: Linux 2.4.20-skas3 X-Message: GnuPG/PGP5 are welcome X-Key-ID: 860314FC/C09E842C X-Key-FP: 9C15F0D3E3093593AC952C92A0CD52B4860314FC X-Key-URL: http://members.ferrara.linux.it/pioppo/mykey.asc X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-archive-position: 2114 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pioppo@ferrara.linux.it Precedence: bulk X-list: netdev On Mon, Mar 31, 2003 at 08:23:58PM +0200, Peter Bieringer wrote: > Address was already added by autoconfiguration on receiving advertisement > (limited lifetime). Now the same address would be added manually (unlimited > lifetime). > > What (should) happen? > > Mho: manual add is allowed, both addresses need to be listed. I'd prefer this variant: manual add is allowed and overwrites the autoconfigured address. 
-- Simone Piunno -- http://members.ferrara.linux.it/pioppo .------- Adde parvum parvo magnus acervus erit -------. Ferrara Linux Users Group - http://www.ferrara.linux.it Deep Space 6, IPv6 on Linux - http://www.deepspace6.net GNU Mailman, Mailing List Manager - http://www.list.org `-------------------------------------------------------'